Execution¶
Using git_info
¶
To use, there are a number of ways to import it the main class,
GitInfo
.
import redata
code_path = "/path/to/repo"
gi = redata.commons.git_info.GitInfo(code_path)
or
from redata.commons.git_info import GitInfo
code_path = "/path/to/repo"
gi = GitInfo(code_path)
Using logger
¶
There are a number of functions and classes available with logger
:
LogClass
: The mainLogger
object for stdout and file logging
LogCommons
: Object that has methods to simplify repetitive logging
log_stdout
: Function for stdout logging
log_setup
: Function to set-up stdout and file logging. CallLogClass.get_logger()
get_user_hostname
: Function to retrieve system information (user, host, IP, OS)
get_log_file
: Function to retrieve filenames for file logging
log_settings
: Function to log configuration settings. Only arguments specified through CLI arguments are shown
pandas_write_buffer
: Function to write a prettified (i.e., Markdown) version of the table to log handler(s)
First you can either import logger
via:
import redata
or
from redata.commons import logger
To construct a stdout and file logging object, the simplest approach is to use log_setup()
:
from redata.commons import logger
log_dir = '/mnt/curation'
logfile_prefix = 'mylog'
log = logger.log_setup(log_dir, logfile_prefix)
log.info("print log message")
To only log to stdout, use log_stdout()
:
from redata.commons import logger
log_std = logger.log_stdout()
log_std.info("print log message")
For simplicity, LogCommons
simplifies many of the calls in various scripts
and modules:
from redata.commons import logger, git_info
log_dir = '/mnt/curation'
logfile_prefix = 'mylog'
log = logger.log_setup(log_dir, logfile_prefix)
code_path = "/path/to/repo"
gi = git_info.GitInfo(code_path)
lc = LogCommons(log, 'script_run', gi)
lc.script_start() # Starting log message
lc.script_sys_info() # Retrieves user and hostname metadata and write to log
lc.script_end() # End of script
lc.log_permission() # Change permission of log file to read and write for creator and group
To retrieve the full path of the file log, use get_log_file()
:
from redata.commons import logger
log_dir = '/mnt/curation'
logfile_prefix = 'mylog'
log = logger.log_setup(log_dir, logfile_prefix)
for handler in log.handlers:
log_file = logger.get_log_file(handler)
To retrieve system (OS, IP) and user information, use get_user_hostname()
:
from redata.commons import logger
sys_info_dict = logger.get_user_hostname()
The log_settings
allows for explicit logging of input arguments to
command-line scripts. The below example uses inputs specific to ReQUIAM.
from redata.commons import logger
log_dir = '/mnt/curation'
logfile_prefix = 'mylog'
log = logger.log_setup(log_dir, logfile_prefix)
config_dict = {
'ldap_host': 'eds.iam.arizona.edu',
'ldap_base_dn': 'dc=eds,dc=arizona,dc=edu',
'ldap_user': 'figshare',
'ldap_password': '***override***'
}
vargs = {'ldap_password': 'abcdef123456'}
protected_keys = ['ldap_password']
logger.log_settings(vargs, config_dict, protected_keys, log=log)
Finally, pandas_write_buffer
is often used to provide pandas
DataFrame
in logs:
from redata.commons import logger
import pandas as pd
log_dir = '/mnt/curation'
logfile_prefix = 'mylog'
log = logger.log_setup(log_dir, logfile_prefix)
for handler in log.handlers:
log_filename = logger.get_log_file(handler)
df = pd.read_csv('data.csv') # This is a dummy filename
logger.pandas_write_buffer(df, log_filename)