pyslk API reference¶
pyslk v0.5.8
module overview¶
The pyslk package provides Python wrapper functions for slk and slk_helpers. slk is the official command line interface (CLI) of the StrongLink system by StrongBox Data Solutions. slk_helpers was developed at DKRZ and provides additional functionality beyond slk.
pyslk.pyslk: basic wrappers for nearly all slk and slk_helpers commands
pyslk.parsers: extended wrappers for a few slk and slk_helpers commands
pyslk.utils: some utility functions
pyslk.exceptions: one exception class used in this package
pyslk.pyslk module¶
pyslk.pyslk provides basic Python wrapper functions for all commands of slk and of slk_helpers. If a command returns a non-zero exit code, the calling wrapper function raises a PySlkException.
The wrapper functions are made for slk version 3.3.6 and higher. If this package is used with an older slk version, some optional arguments might not work and their usage might cause PySlkExceptions.
- pyslk.pyslk.slk_arch_size(path_or_id)¶
Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes
- Parameters
path_or_id (str) – search id or gns path
- Returns
archive size in bytes
- Return type
float
- pyslk.pyslk.slk_archive(src_path, dst_gns, partsize=None, streams=None, recursive=False, preserve_permissions=False, excluse_hidden=False)¶
Upload files in a directory and optionally tag resources using directory path and GNS path
- Parameters
src_path (str) – local path of the files or directory to archive
dst_gns (str) – destination GNS path for the archived files
partsize (int) – size of each file to upload per stream, Default: 500
streams (int) – number of file part streams to use per node, Default: 4
recursive (bool) – use the -R flag to archive recursively, Default: False
preserve_permissions (bool) – preserve original file permission, Default: False
excluse_hidden (bool) – exclude . (hidden) files, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_checksum(resource_path, checksum_type=None)¶
Get a checksum of a resource
- Parameters
resource_path – resource (full path)
checksum_type (str) – checksum_type (possible values: None, “sha512”, “adler32”; None => print all)
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_chmod(gns_path, mode, recursive=False)¶
Change the access mode of a resource or namespace
- Parameters
gns_path (str) – namespace or file (full GNS path)
mode (str or int) – new mode/permissions of a file (as known from bash’s chmod)
recursive (bool) – use the -R flag to change the mode recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_delete(gns_path, recursive=False)¶
Soft delete a namespace (optionally all child objects of a non-empty namespace) or a specific file
- Parameters
gns_path (str) – namespace or file (full GNS path)
recursive (bool) – use the -R flag to delete recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_exists(gns_path)¶
Check if resource exists
- Parameters
gns_path (str) – namespace or resource
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_gen_file_query(resources, recursive=False)¶
Generates a search query that searches for the listed resources
A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:
a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)
a filename without full path (e.g. INDEX.txt)
a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)
a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)
Details are given in the slk_helpers documentation at https://docs.dkrz.de
- Parameters
resources (str or list) – list of resources to be searched for
recursive (bool) – do recursive search in the namespaces
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_group(gns_path, group, recursive=False)¶
Change the group of a resource or namespace
- Parameters
gns_path (str) – namespace or file (full GNS path)
group (str or int) – new group of a file (group name or gid)
recursive (bool) – use the -R flag to change the group recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_helpers_version()¶
List the version of slk_helpers
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_hostname()¶
Shows current hostname you are connected to
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_iscached(resource_path)¶
Get info whether file is in HSM cache or not
- Parameters
resource_path – resource (full path)
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_list(path_or_id, all=False, numeric_ids=False, recursive=False, text=False)¶
List results from search id or GNS path
- Parameters
path_or_id (str) – search id or gns path
all (bool) – show ‘.’ files, default: False (don’t show these files)
numeric_ids (bool) – show numeric values for user and group, default: False (show user and group names)
recursive (bool) – use the -R flag to list recursively, default: False
text (bool) – print result to file ‘slk_${USER}_list.txt’, default: False (print to command line / print non-empty return string)
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_list_search(search_id, only_files=False, only_namespaces=False)¶
List results from search id
- Parameters
search_id (str) – search id of search which results should be printed
only_files (bool) – print only files (like default for slk list)
only_namespaces (bool) – print only namespaces
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_metadata(resource_path)¶
Get metadata
- Parameters
resource_path – resource (full path)
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_mkdir(gns_path, recursive=False)¶
Create a directory
- Parameters
gns_path (str) – gns path to create
recursive (bool) – use the -R flag to create folders recursively if the parent folders do not exist
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_move(src_path, dst_gns)¶
Move namespaces/files from one parent folder to another; renaming is not possible
- Parameters
src_path (str) – namespace or file (full GNS path)
dst_gns (str) – new parent namespace
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_owner(gns_path, owner, recursive=False)¶
Change the owner of a resource or namespace
- Parameters
gns_path (str) – namespace or file (full GNS path)
owner (str or int) – new owner of a file (username or uid)
recursive (bool) – use the -R flag to change the owner recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_recall(path_or_id, recursive=False)¶
Recall files from tape to cache via search id or GNS path
- Parameters
path_or_id (str) – search id or gns path of resources to recall
recursive (bool) – use the -R flag to recall recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_rename(old_name, new_name)¶
Rename a folder or file; moving is not possible
- Parameters
old_name (str) – folder or file name (full GNS path)
new_name (str) – new name (only name; no full GNS path)
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_resourcepath(resource_id)¶
Get path for a resource id
- Parameters
resource_id (str or int) – a resource_id
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_retrieve(path_or_id, dest_dir, partsize=None, streams=None, recursive=False)¶
DEACTIVATED: Retrieve files via search id or GNS path
- Parameters
path_or_id (str) – search id or gns path of resources to retrieve
dest_dir (str) – destination directory for retrieval
partsize (int) – size of each file to download per stream, Default: 500
streams (int) – number of file part streams to use per node, Default: 4
recursive (bool) – use the -R flag to retrieve recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_search(group=None, user=None, name=None, search_string=None)¶
DEACTIVATED: Creates search and returns search id
Either group, user and/or name can be set or a search_string can be provided.
search_string has to be a valid JSON search string as described in the SLK-CLI manual. Plain double quotes (") have to be used in the JSON expression (no escaped double quotes, no escaped special characters).
- Example for search_string:
search_string='{"resources.mtime": {"$gt": "2021-09-02"}}'
- Parameters
group (str or int) – search for files belonging to the provided group name or GID
user (str or int) – search for files belonging to the provided username or UID
name (str) – search files having the provided name
search_string (str) – JSON search query string
- Returns
stdout of the slk call
- Return type
str
- pyslk.pyslk.slk_search_limited(search_string)¶
Performs a search based on the provided search_string.
search_string has to be a valid JSON search string as described in the SLK-CLI manual. Plain double quotes (") have to be used in the JSON expression (no escaped double quotes, no escaped special characters).
- Example for search_string:
search_string='{"resources.mtime": {"$gt": "2021-09-02"}}'
- Parameters
search_string (str) – JSON search query string
- Returns
stdout of the slk call
- Return type
str
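Because search_string must be valid JSON with plain double quotes, one way to avoid quoting mistakes is to build the query as a Python dict and serialise it with json.dumps. The following is a minimal sketch: the helper name build_search_string is hypothetical and not part of pyslk, and only the resources.mtime / $gt pair is taken from the example above.

```python
import json


def build_search_string(query):
    """Serialise a query dict to a JSON string with plain double quotes."""
    return json.dumps(query)


# Query from the example above: resources modified after 2021-09-02.
search_string = build_search_string({"resources.mtime": {"$gt": "2021-09-02"}})
print(search_string)
```

The resulting string could then be passed as the search_string argument of slk_search_limited.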
- pyslk.pyslk.slk_session()¶
Shows expiration date of your token
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_size(resource_path)¶
Returns file size in bytes
- Parameters
resource_path (str) – namespace or resource
- Returns
stdout of the slk_helpers call
- Return type
str
- pyslk.pyslk.slk_tag(path_or_id, metadata, recursive=False)¶
Apply metadata to the namespace and child resources
- Parameters
path_or_id (str) – search id or gns path of resources to tag
metadata (dict) – dict that holds as keys “[metadata schema].[field]” and as values the metadata values
recursive (bool) – use the -R flag to tag recursively, Default: False
- Returns
stdout of the slk call
- Return type
str
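The metadata argument expects a flat dict whose keys have the form "[metadata schema].[field]". If metadata is kept in a nested structure, it can be flattened first. This is a sketch only: flatten_metadata is a hypothetical helper, and the schema and field names below are made up for illustration.

```python
def flatten_metadata(nested):
    """Turn {schema: {field: value}} into {"schema.field": value}."""
    flat = {}
    for schema, fields in nested.items():
        for field, value in fields.items():
            flat[f"{schema}.{field}"] = value
    return flat


# Hypothetical schema and field names, used only for illustration.
metadata = flatten_metadata(
    {"example_schema": {"title": "model run 42", "creator": "jane"}}
)
print(metadata)
```

The flattened dict has the key layout described for the metadata parameter of slk_tag.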
- pyslk.pyslk.slk_version()¶
List the version of slk
- Returns
stdout of the slk call
- Return type
str
pyslk.parsers module¶
pyslk.parsers provides functions that parse the output of the basic wrapper functions of pyslk.pyslk and provide a nicer-to-work-with output.
- pyslk.parsers.slk_arch_size_format(path_or_id, return_format='B')¶
DEPRECATED. Please use ‘slk_arch_size_formatted’ instead
- Parameters
path_or_id (str) – search id or gns path
return_format (str) – unit prefix of the returned size; must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B
- Returns
archive size
- Return type
string
- pyslk.parsers.slk_arch_size_formatted(path_or_id, return_format='B')¶
Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes
- Parameters
path_or_id (str) – search id or gns path
return_format (str) – unit prefix of the returned size; must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B
- Returns
archive size
- Return type
string
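To illustrate what the return_format prefixes mean, the sketch below converts a byte count into a string for each of B, K, M, G, T, P and h. It is not the pyslk implementation: the function name format_size, the use of binary multiples (1024) and the one-decimal output format are all assumptions made for this example.

```python
UNITS = ["B", "K", "M", "G", "T", "P"]


def format_size(size_bytes, return_format="B"):
    """Format a byte count with one of the prefixes B, K, M, G, T, P or h."""
    if return_format == "h":
        # "human-readable": divide by 1024 until the value drops below 1024
        # (assumption: binary multiples; pyslk may use a different base)
        exponent = 0
        value = float(size_bytes)
        while value >= 1024 and exponent < len(UNITS) - 1:
            value /= 1024
            exponent += 1
    elif return_format in UNITS:
        exponent = UNITS.index(return_format)
        value = size_bytes / 1024 ** exponent
    else:
        raise ValueError(f"unknown return_format: {return_format!r}")
    return f"{value:.1f}{UNITS[exponent]}"


print(format_size(1536, "K"))
print(format_size(3 * 1024 ** 3, "h"))
```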
- pyslk.parsers.slk_exists_bool(gns_path)¶
Check if resource exists and return True/False
- Parameters
gns_path (str) – namespace or resource
- Returns
True if file exists; False otherwise
- Return type
bool
- pyslk.parsers.slk_iscached_bool(gns_path)¶
Check whether file is in HSM cache or not; return True/False
- Parameters
gns_path (str) – namespace or resource
- Returns
True if file is in cache; False otherwise
- Return type
bool
- pyslk.parsers.slk_list_formatted(path_or_id, all=False, numeric_ids=False, recursive=False, text=False, column_widths=[12, 12, 12, 9, 4, 4, 5, 999], column_names=['permissions', 'owner', 'group', 'filesize', 'day', 'month', 'year', 'filename'], parse_dates=True, parse_sizes=True, full_path=True)¶
Return pandas.DataFrame containing results from search id or GNS path
Calls ‘pyslk.pyslk.slk_list(…)’ and parses the return string into a pandas.DataFrame. All arguments are copied 1:1 into the ‘slk_list’ call except for ‘column_widths’. Assumes eight output columns of ‘slk list’ having the widths 12, 12, 12, 9, 4, 4, 5 and 999 (999 => as wide as necessary).
Note: ‘slk list’ currently prints only the modification date and no modification time. The output of this parser might change in the future when modification times are printed as well.
- Parameters
path_or_id (str) – search id or gns path
all (bool) – show ‘.’ files, default: False (don’t show these files)
numeric_ids (bool) – show numeric values for user and group, default: False (show user and group names)
recursive (bool) – use the -R flag to list recursively, default: False
text (bool) – print result to file ‘slk_${USER}_list.txt’, default: False (print to command line / print non-empty return string)
column_widths (list) – fixed widths of columns to be split
column_names (list) – names of the columns in the pandas.DataFrame
parse_dates (bool) – parse day, month and year into a datetime column.
parse_sizes – parse filesize column into bytes integer.
full_path (bool) – add full filepath to filename column.
- Returns
output of ‘slk list’ parsed into a pandas.DataFrame with eight columns (permissions, owner, group, filesize, day, month, year, filename)
- Return type
pandas.DataFrame
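The role of column_widths and column_names can be illustrated with plain string slicing. The sketch below uses only the standard library and returns a dict per line, whereas slk_list_formatted returns a pandas.DataFrame; it shows the fixed-width idea, not the actual parser. The function split_fixed_width and the sample line are hypothetical, and the sample is padded so that it matches the default widths.

```python
COLUMN_WIDTHS = [12, 12, 12, 9, 4, 4, 5, 999]
COLUMN_NAMES = ["permissions", "owner", "group", "filesize",
                "day", "month", "year", "filename"]


def split_fixed_width(line, widths=COLUMN_WIDTHS, names=COLUMN_NAMES):
    """Cut one line into named fields of the given widths and strip padding."""
    fields = {}
    start = 0
    for name, width in zip(names, widths):
        fields[name] = line[start:start + width].strip()
        start += width
    return fields


# Hypothetical sample line, padded to match the assumed column widths.
values = ["-rw-r--r--", "k204221", "bm0146", "1268945",
          "10", "Jun", "2020", "INDEX.txt"]
sample = "".join(
    f"{v:<{w}}" for v, w in zip(values[:-1], COLUMN_WIDTHS)
) + values[-1]
row = split_fixed_width(sample)
print(row)
```

If the real output of ‘slk list’ uses different column widths, the widths list has to be adapted, which is what the column_widths argument is for.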
- pyslk.parsers.slk_list_search_formatted(search_id, only_files=False, only_namespaces=False, column_widths=[12, 16, 999], column_names=['permissions', 'filesize', 'filename'])¶
Return pandas.DataFrame containing results from search id
Calls ‘pyslk.pyslk.slk_list_search(…)’ and parses the return string into a pandas.DataFrame. Assumes three output columns of ‘slk_helpers list_search’ having the widths 12, 16 and 999 (999 => as wide as necessary).
- Parameters
search_id (str) – search id of search which results should be printed
only_files (bool) – print only files (like default for slk list)
only_namespaces (bool) – print only namespaces
column_widths (list) – fixed widths of columns to be split
column_names (list) – names of the columns in the pandas.DataFrame
- Returns
output of ‘slk_helpers list_search’ parsed into a pandas.DataFrame with three columns (permissions, filesize, filename)
- Return type
pandas.DataFrame
- pyslk.parsers.slk_search_limited_int(search_string)¶
Performs a search based on the provided search_string and returns the search id.
search_string has to be a valid JSON search string as described in the SLK-CLI manual. Plain double quotes (") have to be used in the JSON expression (no escaped double quotes, no escaped special characters).
- Example for search_string:
search_string='{"resources.mtime": {"$gt": "2021-09-02"}}'
- Parameters
search_string (str) – JSON search query string
- Returns
search id of the performed search
- Return type
int
- pyslk.parsers.valid_session()¶
Returns whether session token is valid or not
- Returns
True if valid token exists; False otherwise
- Return type
bool
- pyslk.parsers.valid_token()¶
Returns whether session token is valid or not
- Returns
True if valid token exists; False otherwise
- Return type
bool
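The boolean parsers combine naturally into a small pre-check before recalling a file. The following sketch assumes the signatures documented above (valid_token(), slk_exists_bool(gns_path), slk_iscached_bool(gns_path) and slk_recall(path_or_id)). The function recall_if_needed and its parsers and wrappers parameters are hypothetical; the parameters exist only so that the sketch can be exercised with stand-ins on a machine without slk.

```python
def recall_if_needed(gns_path, parsers=None, wrappers=None):
    """Recall a file to the HSM cache unless it is already cached.

    `parsers` and `wrappers` default to pyslk.parsers and pyslk.pyslk.
    """
    if parsers is None or wrappers is None:
        # Imported lazily so that the sketch also loads without pyslk.
        from pyslk import parsers as _parsers, pyslk as _wrappers
        parsers = parsers or _parsers
        wrappers = wrappers or _wrappers
    if not parsers.valid_token():
        raise RuntimeError("no valid session token")
    if not parsers.slk_exists_bool(gns_path):
        raise FileNotFoundError(gns_path)
    if parsers.slk_iscached_bool(gns_path):
        return "already cached"
    # Returns the stdout of the slk call, as documented for slk_recall.
    return wrappers.slk_recall(gns_path)
```

With pyslk installed and slk available, calling recall_if_needed with only a GNS path would use the real modules.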
pyslk.utils module¶
pyslk.utils contains utilities needed by pyslk.pyslk.
- pyslk.utils.file_list_2_search_query(file_list)¶
Generates an RQL search query to find the listed files
Takes a list of files (with absolute paths) and creates an RQL search query.
- Parameters
file_list (list of str) – list of files with absolute path
- Returns
search query to find the files listed in file_list
- Return type
str
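The general idea of turning a file list into one search query can be sketched as follows. This is not the pyslk implementation: the function name file_list_to_query is hypothetical, and the query fields path and resources.name with the operators $gte, $regex, $and and $or are assumptions modelled on the JSON query style of the resources.mtime example; the query that pyslk actually generates may look different.

```python
import json
import posixpath


def file_list_to_query(file_list):
    """Build a JSON query string matching any of the given absolute paths."""
    if not file_list:
        raise ValueError("file_list must not be empty")
    conditions = []
    for path in file_list:
        directory, name = posixpath.split(path)
        # Assumed field names: "path" for the namespace, "resources.name"
        # for the file name.
        conditions.append(
            {"$and": [{"path": {"$gte": directory}},
                      {"resources.name": {"$regex": name}}]}
        )
    # A single file needs no surrounding "$or".
    query = conditions[0] if len(conditions) == 1 else {"$or": conditions}
    return json.dumps(query)


print(file_list_to_query(["/arch/bm0146/k204221/INDEX.txt"]))
```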
- pyslk.utils.run_slk(slk_call, fun)¶
Runs a provided slk_call via subprocess.run and returns stdout
The argument ‘slk_call’ is used as argument in a call of ‘subprocess.run(…)’. If slk_call is a string then ‘shell=True’ is set in the run(…)-call. Otherwise, the argument ‘shell’ is omitted.
If the command provided via ‘slk_call’ does not exist (e.g. ‘slk’ not available on this system) or if ‘slk’ returns an exit code ‘!= 0’ then a PySlkException is raised.
- Parameters
slk_call (str or list of str) – slk call as input for subprocess.run
fun (str) – name of the calling pyslk function; printed when exception is raised
- Returns
stdout of the slk call
- Return type
str
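The behaviour described for run_slk can be approximated with a few lines around subprocess.run. The sketch below is an illustration under assumptions, not the pyslk source: run_command and SketchSlkError are hypothetical names, with SketchSlkError standing in for PySlkException, and the error message format is made up.

```python
import subprocess
import sys


class SketchSlkError(Exception):
    """Stand-in for pyslk.exceptions.PySlkException in this sketch."""


def run_command(call, fun):
    """Run `call` via subprocess.run and return its stdout as str."""
    kwargs = {"capture_output": True, "text": True}
    if isinstance(call, str):
        # A plain string is handed to the shell, as described above.
        kwargs["shell"] = True
    try:
        result = subprocess.run(call, **kwargs)
    except FileNotFoundError as err:
        # The executable does not exist on this system.
        raise SketchSlkError(f"{fun}: command not found: {err}") from err
    if result.returncode != 0:
        raise SketchSlkError(
            f"{fun}: exit code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Demonstration with the Python interpreter instead of slk.
print(run_command([sys.executable, "-c", "print('ok')"], "demo"))
```

Passing the call as a list avoids shell quoting issues; a string call is convenient but goes through the shell, so its content should be trusted.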
pyslk.exceptions module¶
pyslk.exceptions provides the class PySlkException, which is the general exception for the pyslk package.
- exception pyslk.exceptions.PySlkException¶
Bases: Exception
A PySlkException derived from ‘Exception’
Changelog¶
0.5.8 (2022-05-12)¶
added new functions to generate search queries from a file list
0.5.7 (2022-04-14)¶
minor changes in the package publishing workflow
0.5.6 (2022-04-14)¶
removed argument return_format from pyslk.slk_arch_size (was accidentally kept in previous version)
0.5.5 (2022-03-21)¶
utils._parse_size(...) returns a math.nan if a file size cannot be converted
functions that use utils._parse_size(...) were adapted
Thanks to Lars Buntemeyer (GERICS) and Helge Heuer (DLR) for contributions to 0.5.x
0.5.4 (2022-03-21)¶
new function pyslk.slk_arch_size(...) simplified (removed “format” argument)
new function parsers.slk_arch_size_format(...)
internal restructuring of modules parsers and utils
moving functions between parsers and utils
Pandas is not required for the whole parsers module
new module constants to host constants
0.5.3 (2022-03-18)¶
new function pyslk.slk_arch_size(...) (contributed by Helge Heuer, DLR)
updated doc: pyslk installed on levante
0.5.2 (2022-03-14)¶
minor version number fixes
0.5.1 (2022-03-14)¶
build conda packages for python 3.9 and 3.10
0.5.0 (2022-03-14)¶
slk group is activated now
pyslk extended by new slk_helpers commands: version, iscached and search_limited; correspond to pyslk.slk_helpers_version(...), pyslk.slk_iscached(...) and pyslk.slk_search_limited(...)
paths containing //, /../ and /./ now properly work with all commands
parsers extended by parsers.slk_iscached_bool and parsers.slk_search_limited_int
0.4.0 (2021-11-14)¶
argument updates in pyslk.pyslk (see below)
argument updates in pyslk.parsers (see below)
removed non-existing arguments in pyslk.pyslk.recall
- argument name changes (everywhere, see above):
R => recursive
n => numeric_ids
x => exclude_hidden
a => all
p => preserve_permissions
pyslk.parsers.slk_list_formatted considerably enhanced
doc updates
0.3.5 (2021-11-10)¶
activated slk retrieve (pyslk.slk_retrieve)
throw proper PySlkException when slk or slk_helpers are not available
0.3.4 (2021-10-25)¶
minor corrections in the documentation
deactivated slk retrieve (pyslk.slk_retrieve)
0.3.2 (2021-10-23)¶
minor editorial corrections and addons in the descriptions and readme
0.3.1 (2021-10-23)¶
updated doc structure
modified package description in __init__.py
updated README.md
0.3.0 (2021-10-23)¶
masked pyslk.slk_search and pyslk.slk_group because they do not exist in the slk version for public release (will come later with an update)
updated descriptions
code layout updates after flake8 checks
basic structure for doc generation
0.2.2 (2021-10-02)¶
new function: parsers.slk_exists_bool()
0.2.1 (2021-10-02)¶
new functions: parsers.valid_token() and parsers.valid_session() (synonyms)
0.2.0 (2021-10-01)¶
- package setup:
setup.py, setup.cfg
requirements*.txt files
LICENCE
README.md
finalized all slk functions (group, owner, chmod, tag)
split one large source file into several modules
updates in some function headers (pyslk.slk_delete, pyslk.slk_tag)
several minor bug fixes
0.1.0 (2021-09-20)¶
Initial Release