pyslk API reference

pyslk v0.5.8

module overview

The pyslk package provides Python wrapper functions for slk and slk_helpers. slk is the official command line interface (CLI) of the StrongLink system by StrongBox Data Solutions. slk_helpers was developed at DKRZ and provides additional functionality beyond slk.

  • pyslk.pyslk: basic wrappers for nearly all slk and slk_helpers commands

  • pyslk.parsers: extended wrappers for a few slk and slk_helpers commands

  • pyslk.utils: some utility functions

  • pyslk.exceptions: one exception class used in this package

pyslk.pyslk module

pyslk.pyslk provides basic Python wrapper functions for all commands of slk and of slk_helpers. If a command returns a non-zero exit code, the calling wrapper function raises a PySlkException.

The wrapper functions are made for slk version 3.3.6 and higher. If this package is used with an older slk version, some optional arguments might not work and their usage might cause PySlkExceptions.
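
As a minimal sketch of how calling code might deal with this, the snippet below wraps an arbitrary wrapper call and returns None when a PySlkException is raised. The helper name call_or_none is hypothetical and not part of pyslk. The fallback exception class is an assumption that only exists so the sketch also runs where pyslk is not installed.

```python
try:
    from pyslk.exceptions import PySlkException
except ImportError:
    # Assumption: pyslk is not installed here; a stand-in keeps the sketch runnable.
    class PySlkException(Exception):
        pass


def call_or_none(wrapper, *args, **kwargs):
    """Call a wrapper function; return its result, or None on PySlkException.

    'call_or_none' is a hypothetical helper, not part of pyslk.
    """
    try:
        return wrapper(*args, **kwargs)
    except PySlkException as err:
        # The wrapped command failed (e.g. non-zero exit code).
        print(f"slk call failed: {err}")
        return None
```

With pyslk installed, call_or_none(pyslk.pyslk.slk_exists, gns_path) would return the stdout of the call, or None if the underlying command exited with a non-zero code.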

pyslk.pyslk.slk_arch_size(path_or_id)

Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes

Parameters

path_or_id (str) – search id or gns path

Returns

archive size in bytes

Return type

float

pyslk.pyslk.slk_archive(src_path, dst_gns, partsize=None, streams=None, recursive=False, preserve_permissions=False, excluse_hidden=False)

Upload files in a directory and optionally tag resources using directory path and GNS path

Parameters
  • src_path (str) – local path of the files or directory to archive

  • dst_gns (str) – destination GNS path (namespace) for the archived files

  • partsize (int) – size of each file part to upload per stream, Default: 500

  • streams (int) – number of file part streams to use per node, Default: 4

  • recursive (bool) – use the -R flag to archive recursively, Default: False

  • preserve_permissions (bool) – preserve original file permission, Default: False

  • excluse_hidden (bool) – exclude . (hidden) files, Default: False

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_checksum(resource_path, checksum_type=None)

Get a checksum of a resource

Parameters
  • resource_path – resource (full path)

  • checksum_type (str) – checksum_type (possible values: None, “sha512”, “adler32”; None => print all)

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_chmod(gns_path, mode, recursive=False)

Change the access mode of a resource or namespace

Parameters
  • gns_path (str) – namespace or file (full GNS path)

  • mode (str or int) – new mode/permissions of a file (as known from bash’s chmod)

  • recursive (bool) – use the -R flag to change the mode recursively, Default: False

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_delete(gns_path, recursive=False)

Soft delete a namespace (optionally all child objects of a non-empty namespace) or a specific file

Parameters
  • gns_path (str) – namespace or file (full GNS path)

  • recursive (bool) – use the -R flag to delete recursively, Default: False

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_exists(gns_path)

Check if resource exists

Parameters

gns_path (str) – namespace or resource

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_gen_file_query(resources, recursive=False)

Generates a search query that searches for the listed resources

A search query will be generated which connects all elements of ‘resources’ with an ‘or’. A ‘resource’ (an element of ‘resources’) might be one of these:

  • a filename with full path (e.g. /arch/bm0146/k204221/INDEX.txt)

  • a filename without full path (e.g. INDEX.txt)

  • a regex describing a filename (e.g. /arch/bm0146/k204221/.*.txt)

  • a namespace (e.g. /arch/bm0146/k204221 or /arch/bm0146/k204221/)

Details are given in the slk_helpers documentation at https://docs.dkrz.de

Parameters
  • resources (str or list) – list of resources to be searched for

  • recursive (bool) – do recursive search in the namespaces

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_group(gns_path, group, recursive=False)

Change the group of a resource or namespace

Parameters
  • gns_path (str) – namespace or file (full GNS path)

  • group (str or int) – new group of a file (group name or gid)

  • recursive (bool) – use the -R flag to change the group recursively, Default: False

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_helpers_version()

List the version of slk_helpers

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_hostname()

Shows current hostname you are connected to

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_iscached(resource_path)

Get info whether file is in HSM cache or not

Parameters

resource_path – resource (full path)

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_list(path_or_id, all=False, numeric_ids=False, recursive=False, text=False)

List results from search id or GNS path

Parameters
  • path_or_id (str) – search id or gns path

  • all (bool) – show ‘.’ files, default: False (don’t show these files)

  • numeric_ids (bool) – show numeric values for user and group, default: False (show user and group names)

  • recursive (bool) – use the -R flag to list recursively, default: False

  • text (bool) – print result to file ‘slk_${USER}_list.txt’, default: False (print to command line / print non-empty return string)

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_list_search(search_id, only_files=False, only_namespaces=False)

List results from search id

Parameters
  • search_id (str) – search id of search which results should be printed

  • only_files (bool) – print only files (like default for slk list)

  • only_namespaces (bool) – print only namespaces

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_metadata(resource_path)

Get metadata

Parameters

resource_path – resource (full path)

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_mkdir(gns_path, recursive=False)

Create a directory

Parameters
  • gns_path (str) – gns path to create

  • recursive (bool) – use the -R flag to create parent folders recursively if they do not exist

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_move(src_path, dst_gns)

Move namespaces/files from one parent folder to another; renaming is not possible

Parameters
  • src_path (str) – namespace or file (full GNS path)

  • dst_gns (str) – new parent namespace

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_owner(gns_path, owner, recursive=False)

Change the owner of a resource or namespace

Parameters
  • gns_path (str) – namespace or file (full GNS path)

  • owner (str or int) – new owner of a file (username or uid)

  • recursive (bool) – use the -R flag to change the owner recursively, Default: False

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_recall(path_or_id, recursive=False)

Recall files from tape to cache via search id or GNS path

Parameters
  • path_or_id (str) – search id or gns path of resources to recall

  • recursive (bool) – use the -R flag to recall recursively, Default: False

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_rename(old_name, new_name)

Rename a folder or file; moving is not possible

Parameters
  • old_name (str) – folder or file name (full GNS path)

  • new_name (str) – new name (only name; no full GNS path)

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_resourcepath(resource_id)

Get path for a resource id

Parameters

resource_id (str or int) – a resource_id

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_retrieve(path_or_id, dest_dir, partsize=None, streams=None, recursive=False)

DEACTIVATED: Retrieve files via search id or GNS path

Parameters
  • path_or_id (str) – search id or gns path of resources to retrieve

  • dest_dir (str) – destination directory for retrieval

  • partsize (int) – size of each file to download per stream, Default: 500

  • streams (int) – number of file part streams to use per node, Default: 4

  • recursive (bool) – use the -R flag to retrieve recursively, Default: False

Returns

stdout of the slk call

Return type

str

DEACTIVATED: Creates search and returns search id

Either group, user and/or name can be set or a search_string can be provided.

search_string has to be a valid JSON search string as described in the SLK-CLI manual. Plain double quotes (") have to be used in the JSON expression (no escaped double quotes, no escaped special characters).

Example for search_string:

search_string='{"resources.mtime": {"$gt": "2021-09-02"}}'

Parameters
  • group (str or int) – search for files belonging to the provided group name or GID

  • user (str or int) – search for files belonging to the provided username or UID

  • name (str) – search files having the provided name

  • search_string (str) – JSON search query string

Returns

stdout of the slk call

Return type

str

pyslk.pyslk.slk_search_limited(search_string)

Performs a search based on the provided search_string.

search_string has to be a valid JSON search string as described in the SLK-CLI manual. Plain double quotes (") have to be used in the JSON expression (no escaped double quotes, no escaped special characters).

Example for search_string:

search_string='{"resources.mtime": {"$gt": "2021-09-02"}}'

Parameters

search_string (str) – JSON search query string

Returns

stdout of the slk call

Return type

str
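
One way to obtain a search string with plain double quotes is to build the query as a Python dict and serialize it with json.dumps from the standard library. The following is a minimal sketch of that idea. It only constructs the string; passing it to slk_search_limited requires a working slk installation and is therefore left out.

```python
import json

# Build the query as a dict and let json.dumps take care of the quoting.
query = {"resources.mtime": {"$gt": "2021-09-02"}}
search_string = json.dumps(query)

print(search_string)
# prints: {"resources.mtime": {"$gt": "2021-09-02"}}
```

Building the dict first avoids hand-written quoting mistakes, and json.loads(search_string) can be used to check that the string is valid JSON before it is sent.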

pyslk.pyslk.slk_session()

Shows expiration date of your token

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_size(resource_path)

Returns file size in bytes

Parameters

resource_path (str) – namespace or resource

Returns

stdout of the slk_helpers call

Return type

str

pyslk.pyslk.slk_tag(path_or_id, metadata, recursive=False)

Apply metadata to the namespace and child resources

Parameters
  • path_or_id (str) – search id or gns path of resources to tag

  • metadata (dict) – dict that holds as keys “[metadata schema].[field]” and as values the metadata values

  • recursive (bool) – use the -R flag to tag recursively, Default: False

Returns

stdout of the slk call

Return type

str
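
The metadata argument expects keys of the form “[metadata schema].[field]”. As an illustration, the sketch below builds such a dict from a schema name and a plain field dict. The helper build_metadata as well as the schema and field names are hypothetical and chosen for illustration only; valid schemas and fields depend on the StrongLink instance.

```python
def build_metadata(schema, fields):
    """Prefix each field name with the schema name, giving 'schema.field' keys.

    Hypothetical helper, not part of pyslk.
    """
    return {f"{schema}.{field}": value for field, value in fields.items()}


# Hypothetical schema and field names, for illustration only.
metadata = build_metadata("myschema", {"title": "test run", "project": "demo"})
print(metadata)
```

The resulting dict could then be passed as the metadata argument of slk_tag.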

pyslk.pyslk.slk_version()

List the version of slk

Returns

stdout of the slk call

Return type

str

pyslk.parsers module

pyslk.parsers provides functions that parse the output of the basic wrapper functions of pyslk.pyslk and provide a nicer-to-work-with output.

pyslk.parsers.slk_arch_size_format(path_or_id, return_format='B')

DEPRECATED. Please use ‘slk_arch_size_formatted’ instead

Parameters
  • path_or_id (str) – search id or gns path

  • return_format (str) – Prefix of returned size must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B

Returns

archive size

Return type

string

pyslk.parsers.slk_arch_size_formatted(path_or_id, return_format='B')

Get archive size from search id or GNS path by recursively listing all files of archive and adding file sizes

Parameters
  • path_or_id (str) – search id or gns path

  • return_format (str) – Prefix of returned size must be one of B, K, M, G, T, P or h for Byte, Kilobyte, Megabyte, Gigabyte, Terabyte, Petabyte or “human-readable”; default: B

Returns

archive size

Return type

string
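
To illustrate what the return_format values mean, here is a rough sketch of a byte-to-unit conversion. It is not pyslk's implementation. Two things are assumptions: that units are 1024-based, and that the result is formatted as a number with one decimal followed by the unit letter. The documentation above specifies neither.

```python
UNITS = ["B", "K", "M", "G", "T", "P"]


def format_size(size_bytes, return_format="B"):
    """Convert a byte count to the requested unit.

    Hypothetical helper; assumes 1024-based units and a '<value><unit>' output.
    """
    if return_format == "h":
        # "human-readable": divide by 1024 until the value drops below 1024
        # or the largest unit is reached
        value = float(size_bytes)
        exponent = 0
        while value >= 1024 and exponent < len(UNITS) - 1:
            value /= 1024
            exponent += 1
    elif return_format in UNITS:
        exponent = UNITS.index(return_format)
        value = size_bytes / 1024 ** exponent
    else:
        raise ValueError(f"return_format must be one of {UNITS + ['h']}")
    return f"{value:.1f}{UNITS[exponent]}"


print(format_size(1536, "h"))
```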

pyslk.parsers.slk_exists_bool(gns_path)

Check if resource exists and return True/False

Parameters

gns_path (str) – namespace or resource

Returns

True if file exists; False otherwise

Return type

bool

pyslk.parsers.slk_iscached_bool(gns_path)

Check whether file is in HSM cache or not; return True/False

Parameters

gns_path (str) – namespace or resource

Returns

True if file is in cache; False otherwise

Return type

bool
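
A boolean result makes it easy to trigger a recall only when needed. The sketch below shows that pattern. The function recall_if_not_cached is hypothetical, and the two callables are passed in as arguments so that the sketch runs without slk; with pyslk they would be pyslk.parsers.slk_iscached_bool and pyslk.pyslk.slk_recall.

```python
def recall_if_not_cached(resource_path, is_cached, recall):
    """Trigger a recall for 'resource_path' unless it is already in the cache.

    Hypothetical helper, not part of pyslk.
    is_cached: callable returning True/False for a path
    recall:    callable that starts the recall for a path
    Returns True if a recall was triggered, False otherwise.
    """
    if is_cached(resource_path):
        return False
    recall(resource_path)
    return True
```

Passing the callables explicitly also makes the logic testable with simple stand-in functions.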

pyslk.parsers.slk_list_formatted(path_or_id, all=False, numeric_ids=False, recursive=False, text=False, column_widths=[12, 12, 12, 9, 4, 4, 5, 999], column_names=['permissions', 'owner', 'group', 'filesize', 'day', 'month', 'year', 'filename'], parse_dates=True, parse_sizes=True, full_path=True)

Return pandas.DataFrame containing results from search id or GNS path

Calls ‘pyslk.pyslk.slk_list(…)’ and parses the return string into a pandas.DataFrame. All arguments are copied 1:1 into the ‘slk_list’ call except for ‘column_widths’. Assumes eight output columns of ‘slk list’ having the widths 12, 12, 12, 9, 4, 4, 5 and 999 (999 => as wide as necessary).

Note: ‘slk list’ currently only prints the modification date and no modification time. The output of this parser might be modified in future when modification times are printed as well.

Parameters
  • path_or_id (str) – search id or gns path

  • all (bool) – show ‘.’ files, default: False (don’t show these files)

  • numeric_ids (bool) – show numeric values for user and group, default: False (show user and group names)

  • recursive (bool) – use the -R flag to list recursively, default: False

  • text (bool) – print result to file ‘slk_${USER}_list.txt’, default: False (print to command line / print non-empty return string)

  • column_widths (list) – fixed widths of columns to be split

  • column_names (list) – names of the columns in the pandas.DataFrame

  • parse_dates (bool) – parse day, month and year into a datetime column.

  • parse_sizes (bool) – parse filesize column into bytes integer.

  • full_path (bool) – add full filepath to filename column.

Returns

output of ‘slk list’ parsed into a pandas.DataFrame with eight columns (permissions, owner, group, size, day, month, year, filename)

Return type

pandas.DataFrame
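
The fixed-width splitting that column_widths describes can be sketched with the standard library alone. The snippet below cuts one line into consecutive fields using the default widths and names from the signature above. It is an illustration of the idea, not pyslk's parser: it returns a dict instead of a pandas.DataFrame and does no date or size parsing. The sample line is fabricated and padded to the default widths; it is not real ‘slk list’ output.

```python
COLUMN_WIDTHS = [12, 12, 12, 9, 4, 4, 5, 999]
COLUMN_NAMES = ["permissions", "owner", "group", "filesize",
                "day", "month", "year", "filename"]


def split_fixed_width(line, widths=COLUMN_WIDTHS, names=COLUMN_NAMES):
    """Cut one line into consecutive fixed-width fields and strip the padding."""
    row = {}
    start = 0
    for name, width in zip(names, widths):
        row[name] = line[start:start + width].strip()
        start += width
    return row


# Fabricated sample line, padded to the default widths for illustration only.
fields = ["-rw-r--r--", "k204221", "bm0146", "1.2M",
          "02", "Sep", "2021", "INDEX.txt"]
sample = "".join(f.ljust(w) for f, w in zip(fields[:-1], COLUMN_WIDTHS)) + fields[-1]

row = split_fixed_width(sample)
print(row["filename"])
```

The last width of 999 simply means "the rest of the line", since slicing past the end of a string is safe in Python.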

pyslk.parsers.slk_list_search_formatted(search_id, only_files=False, only_namespaces=False, column_widths=[12, 16, 999], column_names=['permissions', 'filesize', 'filename'])

Return pandas.DataFrame containing results from search id

Calls ‘pyslk.pyslk.slk_list_search(…)’ and parses the return string into a pandas.DataFrame. Assumes three output columns of ‘slk_helpers list_search’ having the widths 12, 16, 999 (999 => as wide as necessary).

Parameters
  • search_id (str) – search id of search which results should be printed

  • only_files (bool) – print only files (like default for slk list)

  • only_namespaces (bool) – print only namespaces

  • column_widths (list) – fixed widths of columns to be split

  • column_names (list) – names of the columns in the pandas.DataFrame

Returns

output of ‘slk_helpers list_search’ parsed into a pandas.DataFrame with three columns (permissions, size, filename)

Return type

pandas.DataFrame

pyslk.parsers.slk_search_limited_int(search_string)

Performs a search based on the provided search_string and returns the search id.

search_string has to be a valid JSON search string as described in the SLK-CLI manual. Plain double quotes (") have to be used in the JSON expression (no escaped double quotes, no escaped special characters).

Example for search_string:

search_string='{"resources.mtime": {"$gt": "2021-09-02"}}'

Parameters

search_string (str) – JSON search query string

Returns

search id of the performed search

Return type

int

pyslk.parsers.valid_session()

Returns whether session token is valid or not

Returns

True if valid token exists; False otherwise

Return type

bool

pyslk.parsers.valid_token()

Returns whether session token is valid or not

Returns

True if valid token exists; False otherwise

Return type

bool

pyslk.utils module

pyslk.utils contains utilities needed by pyslk.pyslk.

pyslk.utils.file_list_2_search_query(file_list)

Generates an RQL search query to find the listed files

Takes a list of files (with absolute paths) and creates an RQL search query.

Parameters

file_list (list of str) – list of files with absolute path

Returns

search query to find the files listed in file_list

Return type

str

pyslk.utils.run_slk(slk_call, fun)

Runs a provided slk_call via subprocess.run and returns stdout

The argument ‘slk_call’ is used as argument in a call of ‘subprocess.run(…)’. If slk_call is a string then ‘shell=True’ is set in the run(…)-call. Otherwise, the argument ‘shell’ is omitted.

If the command provided via ‘slk_call’ does not exist (e.g. ‘slk’ not available on this system) or if ‘slk’ returns an exit code ‘!= 0’ then a PySlkException is raised.

Parameters
  • slk_call (str or list of str) – slk call as input for subprocess.run

  • fun (str) – name of the calling pyslk function; printed when exception is raised

Returns

stdout of the slk call

Return type

str
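
The behaviour described above can be sketched roughly as follows. This is not the source of run_slk, only an illustration of the described logic: shell=True for string calls, and an exception when the command is missing or exits with a non-zero code. The names run_command and SketchSlkError are hypothetical stand-ins, and the demonstration runs the Python interpreter instead of slk so that it works on any system.

```python
import subprocess
import sys


class SketchSlkError(Exception):
    """Stand-in for PySlkException so the sketch has no pyslk dependency."""


def run_command(call, fun):
    """Run 'call' via subprocess.run and return its stdout as str."""
    try:
        result = subprocess.run(
            call,
            # shell=True only for string calls; False is the default otherwise
            shell=isinstance(call, str),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError as err:
        # raised for list calls when the executable does not exist
        raise SketchSlkError(f"{fun}: command not found: {err}") from err
    if result.returncode != 0:
        raise SketchSlkError(
            f"{fun}: exit code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Demonstration with the Python interpreter instead of slk.
output = run_command([sys.executable, "-c", "print('ok')"], "demo")
print(output.strip())
```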

pyslk.exceptions module

pyslk.exceptions provides the class PySlkException, which is the general exception for the pyslk package.

exception pyslk.exceptions.PySlkException

Bases: Exception

A PySlkException derived from ‘Exception’

Changelog

0.5.8 (2022-05-12)

  • added new functions to generate search queries from a file list

0.5.7 (2022-04-14)

  • minor changes in the package publishing workflow

0.5.6 (2022-04-14)

  • removed argument return_format from pyslk.slk_arch_size (was accidentally kept in previous version)

0.5.5 (2022-03-21)

  • utils._parse_size(...) returns a math.nan if a file size cannot be converted

  • functions that use utils._parse_size(...) were adapted

  • Thanks to Lars Buntemeyer (GERICS) and Helge Heuer (DLR) for contributions to 0.5.x

0.5.4 (2022-03-21)

  • new function pyslk.slk_arch_size(...) simplified (removed “format” argument)

  • new function parsers.slk_arch_size_format(...)

  • internal restructuring of modules parsers and utils

  • moving functions between parsers and utils

  • Pandas is not required for the whole parsers module

  • new module constants to host constants

0.5.3 (2022-03-18)

  • new function pyslk.slk_arch_size(...) (contributed by Helge Heuer, DLR)

  • updated doc: pyslk installed on levante

0.5.2 (2022-03-14)

  • minor version number fixes

0.5.1 (2022-03-14)

  • build conda packages for python 3.9 and 3.10

0.5.0 (2022-03-14)

  • slk group is activated now

  • pyslk extended by new slk_helpers commands: version, iscached and search_limited; correspond to pyslk.slk_helpers_version(...), pyslk.slk_iscached(...) and pyslk.slk_search_limited(...)

  • path containing //, /../ and /./ now properly work with all commands

  • parsers extended by parsers.slk_iscached_bool and parsers.slk_search_limited_int

0.4.0 (2021-11-14)

  • argument updates in pyslk.pyslk (see below)

  • argument updates in pyslk.parsers (see below)

  • removed non-existing arguments in pyslk.pyslk.recall

  • argument name changes (everywhere):
    • R => recursive

    • n => numeric_ids

    • x => exclude_hidden

    • a => all

    • p => preserve_permissions

  • pyslk.parsers.slk_list_formatted considerably enhanced

  • doc updates

0.3.5 (2021-11-10)

  • activated slk retrieve (pyslk.slk_retrieve)

  • throw proper PySlkException when slk or slk_helpers are not available

0.3.4 (2021-10-25)

  • minor corrections in the documentation

  • deactivated slk retrieve (pyslk.slk_retrieve)

0.3.2 (2021-10-23)

  • minor editorial corrections and addons in the descriptions and readme

0.3.1 (2021-10-23)

  • updated doc structure

  • modified package description in __init__.py

  • updated README.md

0.3.0 (2021-10-23)

  • masked pyslk.slk_search and pyslk.slk_group because they do not exist in the slk version for public release (will come later with an update)

  • updated descriptions

  • code layout updates after flake8 checks

  • basic structure for doc generation

0.2.2 (2021-10-02)

  • new function: parsers.slk_exists_bool()

0.2.1 (2021-10-02)

  • new functions: parsers.valid_token() and parsers.valid_session() (synonyms)

0.2.0 (2021-10-01)

  • package setup:
    • setup.py, setup.cfg

    • requirements*.txt files

    • LICENCE

    • README.md

  • finalized all slk functions (group, owner, chmod, tag)

  • split one large source file into several modules

  • updates in some function headers (pyslk.slk_delete, pyslk.slk_tag)

  • several minor bug fixes

0.1.0 (2021-09-20)

Initial Release