pyslk.core package
Submodules
pyslk.core.gen_queries module
- pyslk.core.gen_queries.gen_file_query(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) str #
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters:
resources (str or list or set or Path) – list of resources to be searched for
recursive (bool) – do recursive search in the namespaces
cached_only (bool) – restrict the search to files which are in the HSM cache
not_cached (bool) – restrict the search to files which are not in the HSM cache
tape_barcodes (list[str] or str) – restrict the search to files stored on the tapes with these barcodes
- Returns:
generated search query
- Return type:
str
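The ‘or’ connection described above can be pictured with a small stand-alone sketch. It does not call pyslk and is not its implementation: the helper name `sketch_or_query` is hypothetical, and the operator names (`$and`, `path`, `$gte`, `$max_depth`, `resources.name`, `$regex`) are modelled on the `search_query` string in the group_files_by_tape example on this page. The `$or` operator does not appear there and is an assumption. Only full file paths are handled.

```python
import json
from pathlib import PurePosixPath


def sketch_or_query(resources):
    """Build an illustrative 'or'-connected query for full file paths.

    Hypothetical helper; not part of pyslk.
    """
    if not resources:
        raise ValueError("at least one resource is required")
    conditions = []
    for res in resources:
        p = PurePosixPath(str(res))
        conditions.append(
            {
                "$and": [
                    # parent namespace of the file, without recursion
                    {"path": {"$gte": str(p.parent), "$max_depth": 1}},
                    # file name used as regex (not escaped in this sketch)
                    {"resources.name": {"$regex": p.name}},
                ]
            }
        )
    # a single resource needs no surrounding 'or'
    query = conditions[0] if len(conditions) == 1 else {"$or": conditions}
    return json.dumps(query)


print(sketch_or_query(["/arch/bm0146/k204221/INDEX.txt", "/arch/bm0146/k204221/data.nc"]))
```

In practice the query string would come from gen_file_query itself; the sketch only illustrates how several resources could end up as alternatives in one query.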
- pyslk.core.gen_queries.gen_file_query_as_dict(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) dict #
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters:
resources (str or list or set or Path) – list of resources to be searched for
recursive (bool) – do recursive search in the namespaces
cached_only (bool) – restrict the search to files which are in the HSM cache
not_cached (bool) – restrict the search to files which are not in the HSM cache
tape_barcodes (list[str] or str) – restrict the search to files stored on the tapes with these barcodes
- Returns:
generated search query
- Return type:
dict
- pyslk.core.gen_queries.gen_search_query(key_value_pairs: str | list[str] | set[str], recursive: bool = False, search_query: str | None = None) str #
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters:
key_value_pairs (str or list or set) – list of key-value pairs connected via an operator
recursive (bool) – do recursive search in the namespaces
search_query (str) – an existing search query to be extended
- Returns:
generated search query
- Return type:
str
pyslk.core.gfbt module
- pyslk.core.gfbt.count_tapes(resource_path: str | list | Path | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False) dict[str, int] #
Count number of tapes onto which provided files are stored; distinguishes between multi-tape and single-tape files
- Parameters:
resource_path (str or path-like or list) – a resource path (str or Path) or multiple resource paths (in a list)
search_id (int or str) – id of a search
search_query (str) – a search query
recursive (bool) – set whether resource should be evaluated recursively or not
- Returns:
dictionary containing the two tape counts
- Return type:
dict
- pyslk.core.gfbt.count_tapes_with_multi_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int #
Count number of tapes onto which provided files are stored which are split onto multiple tapes per file
Internally calls pyslk.count_tapes()
- Parameters:
resource_path (str or int or path-like) – a resource path
search_id (int or str) – id of a search
search_query (str) – a search query
- Returns:
number of tapes
- Return type:
int
- pyslk.core.gfbt.count_tapes_with_single_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int #
Count number of tapes onto which provided files are stored which are not split onto multiple tapes per file
Internally calls pyslk.count_tapes()
- Parameters:
resource_path (str or int or path-like) – a resource path
search_id (int or str) – id of a search
search_query (str) – a search query
- Returns:
number of tapes
- Return type:
int
- pyslk.core.gfbt.group_files_by_tape(resource_path: Path | str | list | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False, max_tape_number_per_search: int = -1, run_search_query: bool = False) list[dict] #
Group files by tape id.
Group a list of files by their tape id. Does not support all arguments of the slk_helpers group_files_by_tape CLI call. Please use pyslk.count_tapes() to count the number of tapes on which files are stored.
- Parameters:
resource_path (str or list or Path) – list of files or a namespace with files that should be grouped
search_id (int or str) – id of a search
search_query (str) – a search query
recursive (bool) – do recursive search in the namespaces
max_tape_number_per_search (int) – number of tapes per search; if ‘-1’, the parameter is not set
run_search_query (bool) – generate and run (a) search query string(s) instead of the lists of files per tape and print the search id [default: False]
- Returns:
A list of dictionaries containing group and tape info.
- Return type:
list[dict]
Examples
>>> import pyslk as slk
>>> slk.group_files_by_tape(["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"])
[{'id': -1, 'location': 'cache', 'label': '', 'status': '', 'file_count': 2, 'files': ['/test/test3/ingest_01_102', '/test/test3/ingest_01_339'], 'search_query': '{"$and":[{"path":{"$gte":"/test/test3","$max_depth":1}}, {"resources.name":{"$regex":"ingest_01_102|ingest_01_339"}}]}', 'search_id': 416837}]
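The returned list of dictionaries can be post-processed with plain Python. The following sketch works on a copy of the example output above (the ‘search_query’ entry is omitted for brevity) and does not call pyslk. The helper `split_by_location` is hypothetical, and treating `'location': 'cache'` as the marker for cached files is an assumption based on this single example.

```python
# Data copied from the example output above ('search_query' omitted).
groups = [
    {
        "id": -1,
        "location": "cache",
        "label": "",
        "status": "",
        "file_count": 2,
        "files": ["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"],
        "search_id": 416837,
    }
]


def split_by_location(groups):
    """Separate groups located in the cache from all other groups.

    Hypothetical helper; assumes 'location' == 'cache' marks cached files.
    """
    cached, other = [], []
    for group in groups:
        (cached if group["location"] == "cache" else other).append(group)
    return cached, other


cached, other = split_by_location(groups)
n_cached = sum(group["file_count"] for group in cached)
print(f"{n_cached} file(s) in cache, {len(other)} other group(s)")
```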
pyslk.core.metadata module
- pyslk.core.metadata.get_metadata(resource: str | Path, print_hidden: bool = False, print_raw_values: bool = False) dict[str, Union[str, int, float, dict]] | None #
Get metadata
- Parameters:
resource (str or Path) – resource (full path)
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
print_raw_values (bool) – print metadata values without trying to convert them to int/float/dict [default: False]
- Returns:
dictionary with the metadata
- Return type:
dict or None
- pyslk.core.metadata.get_tag(path_or_id: str | int, recursive: bool = False) dict | None #
Get metadata of the namespace and child resources
- Parameters:
path_or_id (str or int) – search id or gns path of resources to retrieve
recursive (bool) – use the -R flag to retrieve metadata recursively, Default: False
- Returns:
metadata of the target files
- Return type:
dict or None
- pyslk.core.metadata.hsm2json(resources: str | Path | list | None = None, search_id: int = -1, recursive: bool = False, outfile: str | Path | None = None, restart_file: str | Path | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]] #
Extract metadata from HSM file(s) and return them in JSON structure
- Parameters:
resources (str or Path or list) – list of resources to be searched for
search_id (int) – id of a search
recursive (bool) – export metadata from all files in gns_path recursively
outfile (str or Path or None) – write the output into a file instead of to stdout
restart_file (str or Path or None) – set a restart file in which the processed metadata entries are listed
schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
write_json_lines (bool) – write JSON-lines instead of JSON [default: False]
write_mode (str) – applies when ‘outfile’ is set; possible values: OVERWRITE, ERROR [default: None]
instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
- Returns:
dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (JSON file); either ‘metadata’ or ‘file’ is none depending on the value of input argument ‘outfile’
- Return type:
dict
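The difference between JSON and JSON-lines output (‘write_json_lines’) can be shown with the standard library alone. This is a generic sketch of the two formats, not pyslk's actual output layout: the record structure and the paths are placeholders. Because each JSON-lines line is a complete document on its own, records can be written one at a time, which is presumably why ‘instant_metadata_record_output’ requires ‘write_json_lines’.

```python
import json

# placeholder records; the real metadata record layout is not shown here
records = [
    {"path": "/arch/xx0000/a.nc", "metadata": {}},
    {"path": "/arch/xx0000/b.nc", "metadata": {}},
]

# JSON: one document holding all records
as_json = json.dumps(records)

# JSON-lines: one complete JSON document per line
as_json_lines = "\n".join(json.dumps(record) for record in records)

# both representations carry the same content
assert json.loads(as_json) == [json.loads(line) for line in as_json_lines.splitlines()]
print(as_json_lines)
```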
- pyslk.core.metadata.hsm2json_dict(resources: str | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]] #
Extract metadata from HSM file(s) and return them in JSON structure
- Parameters:
resources (str or list) – list of resources to be searched for
search_id (int) – id of a search
recursive (bool) – export metadata from all files in gns_path recursively
restart_file (str or None) – set a restart file in which the processed metadata entries are listed
schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
- Returns:
dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (None)
- Return type:
dict
- pyslk.core.metadata.hsm2json_file(outfile: str, resources: str | Path | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) None #
Extract metadata from HSM file(s) and write them in JSON structure to a file
- Parameters:
outfile (str or Path) – file into which the output is written
resources (str or Path or list) – list of resources to be searched for
search_id (int) – id of a search
recursive (bool) – export metadata from all files in gns_path recursively
restart_file (str or None) – set a restart file in which the processed metadata entries are listed
schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
write_json_lines (bool) – write JSON-lines instead of JSON [default: False]
write_mode (str) – possible values: OVERWRITE, ERROR [default: None]
instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
- Returns:
nothing; throws an error if writing failed
- Return type:
None
- pyslk.core.metadata.json2hsm(json_file: str | None = None, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False, json_string: str | None = None) dict #
Read metadata from JSON file and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_file (str) – JSON input file containing metadata
restart_file (str or None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
json_string (str) – provide a JSON string instead of a JSON file; incompatible with json_file
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.metadata.json_dict2hsm(json_dict: dict, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict #
Read metadata from JSON dictionary and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_dict (dict) – a dictionary representing JSON
restart_file (str or None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.metadata.json_file2hsm(json_file: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict #
Read metadata from JSON file and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_file (str) – JSON input file containing metadata
restart_file (str or None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.metadata.json_str2hsm(json_string: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict #
Read metadata from JSON string and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_string (str) – JSON string containing metadata
restart_file (str or None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.metadata.set_tag(path_or_id: str | int, metadata: dict, recursive: bool = False) dict | None #
Apply metadata to the namespace and child resources
- Parameters:
path_or_id (str or int) – search id or gns path of resources to retrieve
metadata (dict) – dict that holds as keys “[metadata schema].[field]” and as values the metadata values
recursive (bool) – use the -R flag to tag recursively, Default: False
- Returns:
new metadata of the target files
- Return type:
dict or None
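The ‘metadata’ argument expects flat keys of the form “[metadata schema].[field]”. A minimal sketch of building such a dictionary from a nested structure is given below. The helper `flatten_metadata` and the schema and field names (`my_schema`, `title`, `version`) are hypothetical and only serve as illustration; the sketch does not call pyslk.

```python
def flatten_metadata(nested):
    """Turn {schema: {field: value}} into {"schema.field": value}.

    Hypothetical helper; not part of pyslk.
    """
    flat = {}
    for schema, fields in nested.items():
        for field, value in fields.items():
            flat[f"{schema}.{field}"] = value
    return flat


# hypothetical schema and field names
metadata = flatten_metadata({"my_schema": {"title": "test run", "version": 2}})
print(metadata)
```

The resulting dictionary has the key layout described for the ‘metadata’ parameter of set_tag.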
pyslk.core.resource module
- pyslk.core.resource.access_hsm(resource: list[str] | list[pathlib.Path] | str | Path, mode: int) bool | list[bool] #
- pyslk.core.resource.arch_size(path_or_id: str | int, unit: str = 'B') dict[str, Union[str, float, int]] #
Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes
- Parameters:
path_or_id (str or int) – search id or gns path
unit (str) – prefix of the returned size; must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B
- Returns:
archive size; key “value” contains the size without unit and key “unit” contains the unit
- Return type:
dict
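A returned dictionary with the keys “value” and “unit” can be converted back to bytes as sketched below. The helper `size_in_bytes` is hypothetical and does not call pyslk. Two assumptions are made: that “unit” holds one of the letters B, K, M, G, T or P, and that each step is a factor of 1024. Whether pyslk uses 1000 or 1024 is not stated here, so the factor is a parameter.

```python
UNIT_EXPONENTS = {"B": 0, "K": 1, "M": 2, "G": 3, "T": 4, "P": 5}


def size_in_bytes(size, base=1024):
    """Convert {'value': ..., 'unit': ...} to a number of bytes.

    Hypothetical helper; 'base' (1024 by default) is an assumption.
    """
    unit = size["unit"]
    if unit not in UNIT_EXPONENTS:
        raise ValueError(f"cannot convert unit {unit!r}")
    return size["value"] * base ** UNIT_EXPONENTS[unit]


print(size_in_bytes({"value": 2.5, "unit": "K"}))
print(size_in_bytes({"value": 3, "unit": "M"}, base=1000))
```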
- pyslk.core.resource.chgrp(gns_path: str, group: str | int, recursive: bool = False) dict | None #
Change the group of a resource or namespace
- Parameters:
gns_path (str) – namespace or file (full GNS path)
group (str or int) – new group of a file (group name or gid)
recursive (bool) – use the -R flag to change the group recursively, Default: False
- Returns:
dict with stdout/stderr, exit code and lists of files with correct and incorrect group; None if gns_path does not exist
- Return type:
dict or None
- pyslk.core.resource.chmod(gns_path: str | list, mode: str | int, recursive: bool = False) bool | None #
Change the access mode of a resource or namespace
- Parameters:
gns_path (str or list) – namespace or file (full GNS path); can be file list
mode (str or int) – new mode/permissions of a file (as known from bash’s chmod)
recursive (bool) – use the -R flag to change the mode recursively, Default: False
- Returns:
True if successful; None if target does not exist; PySlkException is raised on failure
- Return type:
bool or None
- pyslk.core.resource.chown(gns_path: str | Path, owner: str | int, recursive: bool = False) dict | None #
Change the owner of a resource or namespace
- Parameters:
gns_path (str or Path) – namespace or file (full GNS path)
owner (str or int) – new owner of a file (username or uid)
recursive (bool) – use the -R flag to change the owner recursively, Default: False
- Returns:
dict of Paths of the modified files ‘PATH: TRUE-IF-OWNER-CORRECT’; None if gns_path does not exist
- Return type:
dict or None
- pyslk.core.resource.delete(gns_path: str | Path | list, recursive: bool = False) None #
Soft delete a namespace (optionally all child objects of a non-empty namespace) or a specific file
- Parameters:
gns_path (str or list or Path) – namespace or file (full GNS path); can be file list
recursive (bool) – use the -R flag to delete recursively, Default: False
- Returns:
nothing is returned (void function)
- Return type:
None
- pyslk.core.resource.get_checksum(resource: str | Path, checksum_type: str | None = None) dict[str, str] | None #
Get a checksum of a resource
- Parameters:
resource (str or Path) – resource (full path)
checksum_type (str) – checksum type (possible values: None, “sha512”, “adler32”; None => print all)
- Returns:
dictionary with checksum type as key(s) and checksum(s) as value(s); empty keys if no checksum; ‘None’ if resource does not exist
- Return type:
dict or None
- pyslk.core.resource.get_resource_id(resource_path: str | Path) int | None #
Returns the resource id of a resource path
- Parameters:
resource_path (str or path-like) – namespace or resource
- Returns:
resource_id if the file exists; None otherwise
- Return type:
int or None
- pyslk.core.resource.get_resource_permissions(resource: str | int | Path | None = None, as_octal_number: bool = False) str | bool | None #
Get permissions of a resource
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
as_octal_number (bool) – do not return the permissions as combination of x/w/r/- but as three-digit octal number
- Returns:
permissions string; False if resource does not exist
- Return type:
str or bool or None
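The relation between the two output forms (a combination of x/w/r/- versus a three-digit octal number) follows the usual POSIX convention and can be sketched without pyslk. The helper `permissions_to_octal` is hypothetical, and it assumes a nine-character string such as `rwxr-x---`; the exact string format returned by get_resource_permissions is not specified here.

```python
def permissions_to_octal(perm):
    """Convert a nine-character permission string to three octal digits.

    Hypothetical helper; assumes the form 'rwxr-x---' (owner, group, other).
    """
    if len(perm) != 9:
        raise ValueError("expected nine characters, e.g. 'rwxr-x---'")
    digits = []
    for start in range(0, 9, 3):
        value = 0
        # read = 4, write = 2, execute = 1; '-' means the bit is not set
        for char, weight in zip(perm[start:start + 3], (4, 2, 1)):
            if char != "-":
                value += weight
        digits.append(str(value))
    return "".join(digits)


print(permissions_to_octal("rwxr-x---"))
```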
- pyslk.core.resource.get_resource_size(resource: str | int | Path, recursive: bool = False) int | None #
Returns file size in bytes
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
recursive (bool) – use the -R flag to calculate size recursively
- Returns:
size in bytes; None if resource does not exist
- Return type:
int or None
- pyslk.core.resource.get_resource_tape(resource_path: str | Path) dict[int, str] | None #
Returns the tape(s) on which the resource with the given path is stored
- Parameters:
resource_path (str or path-like) – namespace or resource
- Returns:
tape(s) on which a file is stored, as dict; None otherwise
- Return type:
dict[int, str] or None
- pyslk.core.resource.has_no_flag_partial(resource: str | int | Path | list | None = None, search_id: str | int | None = None) bool #
Check whether file(s) is/are flagged as partial; return True/False
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
True if no file is flagged; False otherwise
- Return type:
bool
- pyslk.core.resource.has_no_flag_partial_details(resource: str | int | Path | list | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]] #
Check whether file(s) is/are flagged as partial; returns dict with keys ‘flag_partial’ and ‘no_flag_partial’
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
dictionary with two keys ‘flag_partial’ and ‘no_flag_partial’ which each have a list of files as value
- Return type:
dict[str, list[Path]]
- pyslk.core.resource.makedirs(gns_path: str | Path, exist_ok: bool = False) int #
Create a directory like ‘mkdir()’ but create parent directories recursively, if they do not exist
If exist_ok is False (the default), a FileExistsError is raised if the target directory already exists.
- Parameters:
gns_path (str or Path) – gns path to create
exist_ok (bool) – throw no error if folder already exists (like ‘mkdir -p’)
- Returns:
namespace/resource id of the created namespace
- Return type:
int
- pyslk.core.resource.mkdir(gns_path: str | Path) int #
Create a directory
If the directory already exists, FileExistsError is raised. If a parent directory in the path does not exist, FileNotFoundError is raised.
- Parameters:
gns_path (str or Path) – gns path to create
- Returns:
namespace/resource id of the created namespace
- Return type:
int
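The error behaviour described for mkdir and makedirs mirrors what Python's standard library does on a local file system. The sketch below is only a local analogy using `pathlib` in a temporary directory; it does not touch the HSM and does not call pyslk.

```python
import tempfile
from pathlib import Path

outcomes = []
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "a" / "b"

    # like mkdir(): the parent 'a' does not exist yet
    try:
        target.mkdir()
    except FileNotFoundError:
        outcomes.append("missing parent")

    # like makedirs(): parent directories are created as needed
    target.mkdir(parents=True)

    # like makedirs(exist_ok=False): the target already exists
    try:
        target.mkdir(parents=True, exist_ok=False)
    except FileExistsError:
        outcomes.append("already exists")

    # like makedirs(exist_ok=True): no error, comparable to 'mkdir -p'
    target.mkdir(parents=True, exist_ok=True)

print(outcomes)
```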
- pyslk.core.resource.move(src_path: str, dst_gns: str, no_overwrite: bool) int #
Move namespaces/files from one parent folder to another; renaming is not possible
- Parameters:
src_path (str) – namespace or file (full GNS path)
dst_gns (str) – new parent namespace
no_overwrite (bool) – do not overwrite target file if it exists
- Returns:
return resource id of the moved resource
- Return type:
int
- pyslk.core.resource.rename(old_name: str, new_name: str) int #
Rename a folder or file; moving is not possible
- Parameters:
old_name (str) – folder or file name (full GNS path)
new_name (str) – new name (only name; no full GNS path)
- Returns:
return resource id of the renamed resource
- Return type:
int
pyslk.core.resource_extras module
- pyslk.core.resource_extras.get_resource_type(resource: str | int | Path) str | None #
Get type of resource
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
type of the resource; None if resource does not exist
- Return type:
str or None
- pyslk.core.resource_extras.is_file(resource: str | int | Path | None = None) bool | None #
Returns True if resource is a file
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
True if resource is a file; else False
- Return type:
bool or None
- pyslk.core.resource_extras.is_namespace(resource: str | int | Path | None = None) bool | None #
Returns True if resource is a namespace
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
True if resource is a namespace; else False
- Return type:
bool or None
- pyslk.core.resource_extras.resource_exists(resource: str | Path | int) bool #
Check if resource exists and return True/False
- Parameters:
resource (str or path-like or int) – namespace or resource
- Returns:
True if resource exists; False otherwise
- Return type:
bool
pyslk.core.storage module
- pyslk.core.storage.get_rcrs(resource: str | int | Path) dict | None #
prints resource content record (rcr) information
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
storage information if it exists; None otherwise
- Return type:
dict or None
- pyslk.core.storage.get_storage_information(resource: str | int | Path) dict | None #
prints resource content record (rcr) information
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
storage information if it exists; None otherwise
- Return type:
dict or None
- pyslk.core.storage.get_tape_barcode(tape_id: int | str) str | None #
return tape barcode for provided tape id
- Parameters:
tape_id (int or str) – id of a tape in the tape library
- Returns:
tape barcode; None otherwise
- Return type:
str or None
- pyslk.core.storage.get_tape_id(tape_barcode: str) int | None #
return tape id for provided tape barcode
- Parameters:
tape_barcode (str) – barcode of a tape in the tape library
- Returns:
tape id; None otherwise
- Return type:
int or None
- pyslk.core.storage.is_cached(resource: str | int | Path | None = None, search_id: str | int | None = None) bool #
Check whether file(s) is/are in HSM cache or not; returns True/False
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
True if all files are in cache; False otherwise
- Return type:
bool
- pyslk.core.storage.is_cached_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]] #
Check whether file(s) is/are in HSM cache or not; returns dict with keys ‘cached’ and ‘not_cached’
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
dictionary with two keys ‘cached’ and ‘not_cached’ which each have a list of files as value
- Return type:
dict[str, list[Path]]
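A dictionary of the documented shape (keys ‘cached’ and ‘not_cached’, each holding a list of Path objects) can be evaluated as sketched below. The dictionary is written by hand with placeholder paths instead of being fetched from pyslk, and the variable names are illustrative. That an empty ‘not_cached’ list corresponds to is_cached returning True is an assumption based on the descriptions of both functions.

```python
from pathlib import Path

# hand-written stand-in for a return value; paths are placeholders
details = {
    "cached": [Path("/arch/xx0000/a.nc")],
    "not_cached": [Path("/arch/xx0000/c.nc"), Path("/arch/xx0000/b.nc")],
}

# files that would have to be recalled from tape before retrieval
to_recall = sorted(str(path) for path in details["not_cached"])

# True only if no file is missing from the cache
all_cached = not details["not_cached"]

print(all_cached, to_recall)
```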
- pyslk.core.storage.is_on_tape(resource: str | int | Path | None = None, search_id: str | int | None = None) bool #
Check whether file(s) is/are stored on tape; returns True/False
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
True if all files are stored on tape; False otherwise
- Return type:
bool
- pyslk.core.storage.is_on_tape_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]] #
Check whether file(s) is/are stored on tape or not; returns dict with keys ‘on_tape’ and ‘not_on_tape’
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
dictionary with two keys ‘on_tape’ and ‘not_on_tape’ which each have a list of files as value
- Return type:
dict[str, list[Path]]
- pyslk.core.storage.is_tape_available(tape: int) bool | None #
Check if tape is available
- Parameters:
tape (int or str) – id or barcode of a tape in the tape library
- Returns:
True if tape is available for recalls/retrievals; else False; None if tape does not exist
- Return type:
bool or None
- pyslk.core.storage.tape_exists(tape: int | str) bool #
Check if tape exists
- Parameters:
tape (int or str) – id or barcode of a tape in the tape library
- Returns:
True if tape exists; False otherwise
- Return type:
bool
- pyslk.core.storage.tape_status(tape: int | str, details: bool = False) str | None #
Check the status of a tape
- Parameters:
tape (int or str) – id or barcode of a tape in the tape library
details (bool) – print a more detailed description of the retrieval status
- Returns:
status of the tape; None if tape does not exist
- Return type:
str or None
Module contents
- pyslk.core.access_hsm(resource: list[str] | list[pathlib.Path] | str | Path, mode: int) bool | list[bool] #
- pyslk.core.arch_size(path_or_id: str | int, unit: str = 'B') dict[str, Union[str, float, int]] #
Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes
- Parameters:
path_or_id (str or int) – search id or gns path
unit (str) – prefix of the returned size; must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B
- Returns:
archive size; key “value” contains the size without unit and key “unit” contains the unit
- Return type:
dict
- pyslk.core.chgrp(gns_path: str, group: str | int, recursive: bool = False) dict | None #
Change the group of a resource or namespace
- Parameters:
gns_path (str) – namespace or file (full GNS path)
group (str or int) – new group of a file (group name or gid)
recursive (bool) – use the -R flag to change the group recursively, Default: False
- Returns:
dict with stdout/stderr, exit code and lists of files with correct and incorrect group; None if gns_path does not exist
- Return type:
dict or None
- pyslk.core.chmod(gns_path: str | list, mode: str | int, recursive: bool = False) bool | None #
Change the access mode of a resource or namespace
- Parameters:
gns_path (str or list) – namespace or file (full GNS path); can be file list
mode (str or int) – new mode/permissions of a file (as known from bash’s chmod)
recursive (bool) – use the -R flag to change the mode recursively, Default: False
- Returns:
True if successful; None if target does not exist; PySlkException is raised on failure
- Return type:
bool or None
- pyslk.core.chown(gns_path: str | Path, owner: str | int, recursive: bool = False) dict | None #
Change the owner of a resource or namespace
- Parameters:
gns_path (str or Path) – namespace or file (full GNS path)
owner (str or int) – new owner of a file (username or uid)
recursive (bool) – use the -R flag to change the owner recursively; default: False
- Returns:
dict of Paths of the modified files ‘PATH: TRUE-IF-OWNER-CORRECT’; None if gns_path does not exist
- Return type:
dict or None
- pyslk.core.count_tapes(resource_path: str | list | Path | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False) dict[str, int] #
Count number of tapes onto which provided files are stored; distinguishes between multi-tape and single-tape files
- Parameters:
resource_path (str or path-like or list) – a resource path (str or Path) or multiple resource paths (in a list)
search_id (int or str) – id of a search
search_query (str) – a search query
recursive (bool) – set whether resource should be evaluated recursively or not
- Returns:
dictionary containing the two tape counts
- Return type:
dict
- pyslk.core.count_tapes_with_multi_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int #
Count the number of tapes that store those of the provided files which are split across multiple tapes
Internally calls pyslk.count_tapes()
- Parameters:
resource_path (str or int or path-like) – a resource path
search_id (int or str) – id of a search
search_query (str) – a search query
- Returns:
number of tapes
- Return type:
int
- pyslk.core.count_tapes_with_single_tape_files(resource_path: str | int | Path | None = None, search_id: str | int | None = None, search_query: str | None = None) int #
Count the number of tapes that store those of the provided files which are not split across multiple tapes
Internally calls pyslk.count_tapes()
- Parameters:
resource_path (str or int or path-like) – a resource path
search_id (int or str) – id of a search
search_query (str) – a search query
- Returns:
number of tapes
- Return type:
int
- pyslk.core.delete(gns_path: str | Path | list, recursive: bool = False) None #
Soft delete a namespace (optionally all child objects of a non-empty namespace) or a specific file
- Parameters:
gns_path (str or list or Path) – namespace or file (full GNS path); can be a list of files
recursive (bool) – use the -R flag to delete recursively; default: False
- Returns:
nothing is returned (void function)
- Return type:
None
- pyslk.core.gen_file_query(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) str #
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters:
resources (str or list or set or Path) – list of resources to be searched for
recursive (bool) – do recursive search in the namespaces
cached_only (bool) – search only for files that are in the HSM cache
not_cached (bool) – search only for files that are not in the HSM cache
tape_barcodes (list[str] or str) – search only for files stored on the tapes with the listed barcodes
- Returns:
generated search query
- Return type:
str
- pyslk.core.gen_file_query_as_dict(resources: str | list[str] | set[str] | Path | list[pathlib.Path] | set[pathlib.Path], recursive: bool = False, cached_only: bool = False, not_cached: bool = False, tape_barcodes: list[str] | str | None = None) dict #
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters:
resources (str or list or set or Path) – list of resources to be searched for
recursive (bool) – do recursive search in the namespaces
cached_only (bool) – search only for files that are in the HSM cache
not_cached (bool) – search only for files that are not in the HSM cache
tape_barcodes (list[str] or str) – search only for files stored on the tapes with the listed barcodes
- Returns:
generated search query
- Return type:
dict
- pyslk.core.gen_search_query(key_value_pairs: str | list[str] | set[str], recursive: bool = False, search_query: str | None = None) str #
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters:
key_value_pairs (str or list or set) – list of key-value pairs connected via an operator
recursive (bool) – do recursive search in the namespaces
search_query (str) – an existing search query to be extended
- Returns:
generated search query
- Return type:
str
- pyslk.core.get_checksum(resource: str | Path, checksum_type: str | None = None) dict[str, str] | None #
Get a checksum of a resource
- Parameters:
resource (str or Path) – resource (full path)
checksum_type (str) – checksum type (possible values: None, “sha512”, “adler32”; None => print all)
- Returns:
dictionary with checksum type as key(s) and checksum(s) as value(s); empty keys if no checksum; None if resource does not exist
- Return type:
dict or None
- pyslk.core.get_metadata(resource: str | Path, print_hidden: bool = False, print_raw_values: bool = False) dict[str, Union[str, int, float, dict]] | None #
Get metadata
- Parameters:
resource (str or Path) – resource (full path)
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
print_raw_values (bool) – print metadata values without trying to convert them to int/float/dict [default: False]
- Returns:
dictionary with the metadata
- Return type:
dict or None
- pyslk.core.get_rcrs(resource: str | int | Path) dict | None #
prints resource content record (rcr) information
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
storage information if the resource exists; None otherwise
- Return type:
dict or None
- pyslk.core.get_resource_id(resource_path: str | Path) int | None #
returns resource_id to a resource path
- Parameters:
resource_path (str or path-like) – namespace or resource
- Returns:
resource_id if the file exists; None otherwise
- Return type:
int or None
- pyslk.core.get_resource_permissions(resource: str | int | Path | None = None, as_octal_number: bool = False) str | bool | None #
Get the permissions of a resource
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
as_octal_number (bool) – do not return the permissions as a combination of x/w/r/- but as a three-digit octal number
- Returns:
permissions string; False if resource does not exist
- Return type:
str or bool or None
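The relation between the two permission formats can be sketched in plain Python. This is only an illustration: it assumes the string form is a nine-character sequence such as rwxr-x---, which the documentation above does not spell out, and the helper permissions_to_octal is hypothetical and not part of pyslk.

```python
def permissions_to_octal(perm: str) -> str:
    """Convert a nine-character 'rwxr-x---' string to a three-digit octal string."""
    if len(perm) != 9:
        raise ValueError("expected nine permission characters")
    digits = []
    for start in range(0, 9, 3):
        triplet = perm[start:start + 3]
        # Any character other than '-' counts as a set bit (r=4, w=2, x=1).
        value = sum(bit for ch, bit in zip(triplet, (4, 2, 1)) if ch != "-")
        digits.append(str(value))
    return "".join(digits)


print(permissions_to_octal("rwxr-x---"))
```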
- pyslk.core.get_resource_size(resource: str | int | Path, recursive: bool = False) int | None #
Returns file size in byte
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
recursive (bool) – use the -R flag to calculate the size recursively
- Returns:
size in byte; None if resource does not exist
- Return type:
int or None
- pyslk.core.get_resource_tape(resource_path: str | Path) dict[int, str] | None #
Returns the tape(s) on which the resource with the given path is stored
- Parameters:
resource_path (str or path-like) – namespace or resource
- Returns:
tape(s) on which a file is stored, as dict; None otherwise
- Return type:
dict[int, str] or None
- pyslk.core.get_resource_type(resource: str | int | Path) str | None #
Get type of resource
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
type of the resource; None if resource does not exist
- Return type:
str or None
- pyslk.core.get_storage_information(resource: str | int | Path) dict | None #
prints resource content record (rcr) information
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
storage information if the resource exists; None otherwise
- Return type:
dict or None
- pyslk.core.get_tag(path_or_id: str | int, recursive: bool = False) dict | None #
Get the metadata of a namespace and its child resources
- Parameters:
path_or_id (str or int) – search id or gns path of resources to retrieve
recursive (bool) – use the -R flag to read tags recursively; default: False
- Returns:
metadata of the target files
- Return type:
dict or None
- pyslk.core.get_tape_barcode(tape_id: int | str) str | None #
return tape barcode for provided tape id
- Parameters:
tape_id (int or str) – id of a tape in the tape library
- Returns:
tape barcode; None otherwise
- Return type:
str or None
- pyslk.core.get_tape_id(tape_barcode: str) int | None #
return tape id for provided tape barcode
- Parameters:
tape_barcode (str) – barcode of a tape in the tape library
- Returns:
tape id; None otherwise
- Return type:
int or None
- pyslk.core.group_files_by_tape(resource_path: Path | str | list | None = None, search_id: str | int | None = None, search_query: str | None = None, recursive: bool = False, max_tape_number_per_search: int = -1, run_search_query: bool = False) list[dict] #
Group files by tape id.
Group a list of files by their tape id. Does not have all arguments of the slk_helpers group_files_by_tape CLI call. Please use pyslk.count_tapes() to count the number of tapes on which files are stored.
- Parameters:
resource_path (str, list, Path) – list of files or a namespace with files that should be grouped
search_id (int, str) – id of a search
search_query (str) – a search query
recursive (bool) – do recursive search in the namespaces
max_tape_number_per_search (int) – number of tapes per search; if ‘-1’ => the parameter is not set
run_search_query (bool) – generate and run (a) search query strings instead of the lists of files per tape and print the search i, Default: False
- Returns:
A list of dictionaries containing group and tape info.
- Return type:
list[dict]
Examples
>>> import pyslk as slk
>>> slk.group_files_by_tape(["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"])
[{'id': -1, 'location': 'cache', 'label': '', 'status': '', 'file_count': 2, 'files': ['/test/test3/ingest_01_102', '/test/test3/ingest_01_339'], 'search_query': '{"$and":[{"path":{"$gte":"/test/test3","$max_depth":1}}, {"resources.name":{"$regex":"ingest_01_102|ingest_01_339"}}]}', 'search_id': 416837}]
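A returned list of this shape could be post-processed as in the following sketch, which sums the ‘file_count’ of each group per ‘location’. The helper files_per_location is hypothetical, the first sample entry is abridged from the example above, and the second entry (including the location value ‘tape’ and the id) is an assumption made only for illustration.

```python
from collections import defaultdict


def files_per_location(groups: list[dict]) -> dict[str, int]:
    """Sum the 'file_count' of each group, keyed by its 'location'."""
    totals: dict[str, int] = defaultdict(int)
    for group in groups:
        totals[group["location"]] += group["file_count"]
    return dict(totals)


# First entry abridged from the documented example; second entry is hypothetical.
groups = [
    {"id": -1, "location": "cache", "file_count": 2,
     "files": ["/test/test3/ingest_01_102", "/test/test3/ingest_01_339"]},
    {"id": 12345, "location": "tape", "file_count": 1,
     "files": ["/test/test3/ingest_01_500"]},
]
print(files_per_location(groups))
```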
- pyslk.core.has_no_flag_partial(resource: str | int | Path | list | None = None, search_id: str | int | None = None) bool #
Check whether file(s) is/are flagged as partial; returns True/False
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
True if no file is flagged; False otherwise
- Return type:
bool
- pyslk.core.has_no_flag_partial_details(resource: str | int | Path | list | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]] #
Check whether file(s) is/are flagged as partial; returns dict with keys ‘flag_partial’ and ‘no_flag_partial’
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
dictionary with two keys ‘flag_partial’ and ‘no_flag_partial’ which each have a list of files as value
- Return type:
dict[str, list[Path]]
- pyslk.core.hsm2json(resources: str | Path | list | None = None, search_id: int = -1, recursive: bool = False, outfile: str | Path | None = None, restart_file: str | Path | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]] #
Extract metadata from HSM file(s) and return them in JSON structure
- Parameters:
resources (str or Path or list) – list of resources to be searched for
search_id (int) – id of a search
recursive (bool) – export metadata from all files in gns_path recursively
outfile (str or Path or None) – write the output into a file instead of to stdout
restart_file (str or Path or None) – set a restart file in which the processed metadata entries are listed
schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
write_json_lines (bool, default: False) – write JSON-lines instead of JSON
write_mode (str, default: None) – applies when ‘output’ is set; possible values: OVERWRITE, ERROR
instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
- Returns:
dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (JSON file); either ‘metadata’ or ‘file’ is none depending on the value of input argument ‘outfile’
- Return type:
dict
- pyslk.core.hsm2json_dict(resources: str | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, print_hidden: bool = False) dict[str, Union[dict, list, NoneType]] #
Extract metadata from HSM file(s) and return them in JSON structure
- Parameters:
resources (str or list) – list of resources to be searched for
search_id (int) – id of a search
recursive (bool) – export metadata from all files in gns_path recursively
restart_file (str, default: None) – set a restart file in which the processed metadata entries are listed
schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
- Returns:
dictionary with keys ‘header’ (summary report), ‘metadata’ (actual metadata) and ‘file’ (None)
- Return type:
dict
- pyslk.core.hsm2json_file(outfile: str, resources: str | Path | list = '', search_id: int = -1, recursive: bool = False, restart_file: str | None = None, schema: str | list | None = None, write_json_lines: bool = False, write_mode: str | None = None, instant_metadata_record_output: bool = False, print_hidden: bool = False) None #
Extract metadata from HSM file(s) and write them in JSON structure to a file
- Parameters:
outfile (str or Path) – write the output into a file instead of to stdout
resources (str or Path or list) – list of resources to be searched for
search_id (int) – id of a search
recursive (bool) – export metadata from all files in gns_path recursively
restart_file (str, default: None) – set a restart file in which the processed metadata entries are listed
schema (str or list or None) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
write_json_lines (bool, default: False) – write JSON-lines instead of JSON
write_mode (str, default: None) – applies when ‘output’ is set; possible values: OVERWRITE, ERROR
instant_metadata_record_output (bool) – False (default): read metadata of all files and write/print out afterward; True: write/print each metadata record after it has been read (requires ‘write_json_lines’)
print_hidden (bool) – print read-only not-searchable metadata fields (sidecar file) [default: False]
- Returns:
nothing; throws an error if writing failed
- Return type:
None
- pyslk.core.is_cached(resource: str | int | Path | None = None, search_id: str | int | None = None) bool #
Check whether file(s) is/are in HSM cache or not; returns True/False
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
True if all files are in cache; False otherwise
- Return type:
bool
- pyslk.core.is_cached_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]] #
Check whether file(s) is/are in HSM cache or not; returns dict with keys ‘cached’ and ‘not_cached’
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
dictionary with two keys ‘cached’ and ‘not_cached’ which each have a list of files as value
- Return type:
dict[str, list[Path]]
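As a rough sketch, a dictionary of this shape could be reduced to the share of cached files as shown below. The helper cached_fraction and the sample paths are hypothetical; only the two keys ‘cached’ and ‘not_cached’ are taken from the description above.

```python
from pathlib import Path


def cached_fraction(details: dict[str, list[Path]]) -> float:
    """Return the share of files listed under 'cached' (0.0 if there are no files)."""
    cached = len(details.get("cached", []))
    not_cached = len(details.get("not_cached", []))
    total = cached + not_cached
    return cached / total if total else 0.0


# Hypothetical sample in the documented shape.
sample = {
    "cached": [Path("/arch/ab1234/a.nc")],
    "not_cached": [
        Path("/arch/ab1234/b.nc"),
        Path("/arch/ab1234/c.nc"),
        Path("/arch/ab1234/d.nc"),
    ],
}
print(cached_fraction(sample))
```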
- pyslk.core.is_file(resource: str | int | Path | None = None) bool | None #
Returns True if resource is a file
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
True if resource is a file; else False
- Return type:
bool or None
- pyslk.core.is_namespace(resource: str | int | Path | None = None) bool | None #
Returns True if resource is a namespace
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
- Returns:
True if resource is a namespace; else False
- Return type:
bool or None
- pyslk.core.is_on_tape(resource: str | int | Path | None = None, search_id: str | int | None = None) bool #
Check whether file(s) is/are stored on tape; returns True/False
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
True if all files are stored on tape; False otherwise
- Return type:
bool
- pyslk.core.is_on_tape_details(resource: str | int | Path | None = None, search_id: str | int | None = None) dict[str, list[pathlib.Path]] #
Check whether file(s) is/are stored on tape or not; returns dict with keys ‘on_tape’ and ‘not_on_tape’
- Parameters:
resource (str or int or path-like) – a resource id or a resource path
search_id (int or str) – id of a search
- Returns:
dictionary with two keys ‘on_tape’ and ‘not_on_tape’ which each have a list of files as value
- Return type:
dict[str, list[Path]]
- pyslk.core.is_tape_available(tape: int) bool | None #
Check if tape is available
- Parameters:
tape (int or str) – id or barcode of a tape in the tape library
- Returns:
True if tape is available for recalls/retrievals; else False; None if tape does not exist
- Return type:
bool or None
- pyslk.core.json2hsm(json_file: str | None = None, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False, json_string: str | None = None) dict #
Read metadata from JSON file and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_file (str) – JSON input file containing metadata
restart_file (str, default: None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
json_string (str) – provide a JSON string instead of a JSON file; incompatible with json_file
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.json_dict2hsm(json_dict: dict, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict #
Read metadata from JSON dictionary and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_dict (dict) – a dictionary representing JSON
restart_file (str, default: None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.json_file2hsm(json_file: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict #
Read metadata from JSON file and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_file (str) – JSON input file containing metadata
restart_file (str, default: None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.json_str2hsm(json_string: str, restart_file: str | None = None, schema: str | list | None = None, expect_json_lines: bool = False, verbose: bool = False, quiet: bool = False, ignore_non_existing_metadata_fields: bool = False, write_mode: str | None = None, instant_metadata_record_update: bool = False, use_res_id: bool = False, skip_bad_metadata_sets: bool = False) dict #
Read metadata from JSON string and write them to archived files into HSM. Use absolute paths from metadata records to identify target files.
- Parameters:
json_string (str) – JSON string containing metadata
restart_file (str, default: None) – set a restart file in which the processed metadata entries are listed
schema (str or list) – import only metadata fields of listed schemata; if str: comma-separated list without spaces
expect_json_lines (bool) – read JSON-lines from file instead of JSON
verbose (bool) – verbose mode
quiet (bool) – quiet mode
ignore_non_existing_metadata_fields (bool) – do not throw an error if a metadata field is used which does not exist in StrongLink
write_mode (str) – select write mode for metadata: OVERWRITE, KEEP, ERROR, CLEAN
use_res_id (bool) – use resource_id instead of path to identify file
skip_bad_metadata_sets (bool) – skip damaged / incomplete metadata sets [default: throw error]
instant_metadata_record_update (bool) – False (default): read metadata records of all files and import into StrongLink afterward; True: import each metadata record after it has been read
- Returns:
metadata import summary (key ‘header’)
- Return type:
dict
- pyslk.core.makedirs(gns_path: str | Path, exist_ok: bool = False) int #
Create a directory like ‘mkdir()’ but create parent directories recursively, if they do not exist
If exist_ok is False (the default), a FileExistsError is raised if the target directory already exists.
- Parameters:
gns_path (str or Path) – gns path to create
exist_ok (bool) – throw no error if folder already exists (like ‘mkdir -p’)
- Returns:
namespace/resource id of the created namespace
- Return type:
int
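The exist_ok behaviour described above mirrors that of os.makedirs in the Python standard library. The following sketch demonstrates the semantics on a local temporary directory only; it does not call pyslk or touch the HSM, and whether pyslk.core.makedirs matches os.makedirs in every detail is an assumption.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "a", "b", "c")

    # Missing parent directories 'a' and 'a/b' are created along the way.
    os.makedirs(target)
    created = os.path.isdir(target)

    # A second call without exist_ok raises FileExistsError.
    try:
        os.makedirs(target)
        raised = False
    except FileExistsError:
        raised = True

    # With exist_ok=True the existing directory is accepted silently.
    os.makedirs(target, exist_ok=True)

print(created, raised)
```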
- pyslk.core.mkdir(gns_path: str | Path) int #
Create a directory
If the directory already exists, FileExistsError is raised. If a parent directory in the path does not exist, FileNotFoundError is raised.
- Parameters:
gns_path (str or Path) – gns path to create
- Returns:
namespace/resource id of the created namespace
- Return type:
int
- pyslk.core.move(src_path: str, dst_gns: str, no_overwrite: bool) int #
Move namespaces/files from one parent folder to another; renaming is not possible
- Parameters:
src_path (str) – namespace or file (full GNS path)
dst_gns (str) – new parent namespace
no_overwrite (bool) – do not overwrite target file if it exists
- Returns:
return resource id of the moved resource
- Return type:
int
- pyslk.core.rename(old_name: str, new_name: str) int #
Rename a folder or file; moving is not possible
- Parameters:
old_name (str) – folder or file name (full GNS path)
new_name (str) – new name (only name; no full GNS path)
- Returns:
return resource id of the renamed resource
- Return type:
int
- pyslk.core.resource_exists(resource: str | Path | int) bool #
Check if resource exists and return True/False
- Parameters:
resource (str or path-like) – namespace or resource
- Returns:
True if file exists; False otherwise
- Return type:
bool
- pyslk.core.set_tag(path_or_id: str | int, metadata: dict, recursive: bool = False) dict | None #
Apply metadata to the namespace and child resources
- Parameters:
path_or_id (str or int) – search id or gns path of resources to retrieve
metadata (dict) – dict that holds as keys “[metadata schema].[field]” and as values the metadata values
recursive (bool) – use the -R flag to tag recursively; default: False
- Returns:
new metadata of the target files
- Return type:
dict or None
- pyslk.core.tape_exists(tape: int | str) bool #
Check if tape exists
- Parameters:
tape (int or str) – id or barcode of a tape in the tape library
- Returns:
True if tape exists; False otherwise
- Return type:
bool
- pyslk.core.tape_status(tape: int | str, details: bool = False) str | None #
Check the status of a tape
- Parameters:
tape (int or str) – id or barcode of a tape in the tape library
details (bool) – print a more detailed description of the retrieval status
- Returns:
status of the tape; None if tape does not exist
- Return type:
str or None