sciope.data package

Submodules

sciope.data.dataset module

Dataset Class

class sciope.data.dataset.DataSet(name)

Bases: object

Class for defining a dataset for a modeling/optimization/inference run

Properties/variables:

* x (inputs)
* y (targets)
* ts (time series)
* s (summary statistics)
* outlier_column_indices (columns containing outliers)
* size
* configurations (OrderedDict with relevant information)

Methods:

* get_size (returns current size of the dataset)
* add_points (add data to the dataset, data can be added incrementally)
* process_outliers (check summary stats that contain outliers, and apply log scaling)
* apply_func_to_columns (applies a transformation function to selected column indices of a matrix)

add_points(inputs=None, targets=None, time_series=None, summary_stats=None)

Updates the dataset to include new points

Parameters:

inputs : ndarray, optional

Usually parameter points, by default None

targets : ndarray, optional

The target for inference/optimization/exploration, by default None

time_series : ndarray, optional

Simulation output trajectories, by default None

summary_stats : ndarray, optional

The summary statistics, by default None

Raises:

ValueError

If all function args are None
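The incremental behaviour described above can be sketched with a small stand-in class. MiniDataSet and its _append helper are hypothetical names introduced only for illustration and are not the sciope implementation; stacking batches along the first axis and the way size is computed are assumptions.

```python
import numpy as np


class MiniDataSet:
    """Hypothetical stand-in for illustration; not the sciope implementation."""

    def __init__(self, name):
        self.name = name
        self.x = None   # inputs
        self.y = None   # targets
        self.ts = None  # time series
        self.s = None   # summary statistics
        self.size = 0

    @staticmethod
    def _append(existing, new):
        # The first batch is stored as-is; later batches are stacked along axis 0.
        new = np.asarray(new)
        if existing is None:
            return new
        return np.concatenate((existing, new), axis=0)

    def add_points(self, inputs=None, targets=None,
                   time_series=None, summary_stats=None):
        args = (inputs, targets, time_series, summary_stats)
        if all(arg is None for arg in args):
            raise ValueError("at least one argument must be provided")
        if inputs is not None:
            self.x = self._append(self.x, inputs)
        if targets is not None:
            self.y = self._append(self.y, targets)
        if time_series is not None:
            self.ts = self._append(self.ts, time_series)
        if summary_stats is not None:
            self.s = self._append(self.s, summary_stats)
        # Assumption: size is the largest row count among the stored arrays.
        stored = (self.x, self.y, self.ts, self.s)
        self.size = max(len(a) for a in stored if a is not None)

    def get_size(self):
        return self.size


ds = MiniDataSet("demo")
# Two batches of parameter points with matching summary statistics.
ds.add_points(inputs=np.zeros((3, 2)), summary_stats=np.ones((3, 4)))
ds.add_points(inputs=np.zeros((2, 2)), summary_stats=np.ones((2, 4)))
print(ds.get_size(), ds.x.shape, ds.s.shape)
```

Calling add_points() with no arguments raises ValueError in this sketch, mirroring the documented behaviour.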

static apply_func_to_columns(func, matrix, idx)

Applies a transformation function to selected column indices of a matrix

Parameters:

func : callable

The transformation function

matrix : ndarray

Matrix to be processed

idx : ndarray

Column indices of the matrix to be transformed

Returns:

ndarray

The transformed matrix

Raises:

ValueError

[description]
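A column-wise transformation of this kind might look like the following sketch. The function name apply_to_columns is hypothetical, returning a copy rather than modifying the matrix in place is an assumption, and so are the conditions under which the sketch raises ValueError, since the source leaves that entry undescribed.

```python
import numpy as np


def apply_to_columns(func, matrix, idx):
    """Return a copy of `matrix` with `func` applied to the columns in `idx`.

    Hypothetical helper for illustration; not the sciope implementation.
    """
    matrix = np.asarray(matrix, dtype=float)
    idx = np.atleast_1d(np.asarray(idx, dtype=int))
    if matrix.ndim != 2:
        raise ValueError("matrix must be two-dimensional")
    # Assumption for this sketch: out-of-range indices are rejected up front.
    if idx.size and (idx.min() < 0 or idx.max() >= matrix.shape[1]):
        raise ValueError("column index out of range")
    out = matrix.copy()
    out[:, idx] = func(matrix[:, idx])
    return out


stats = np.array([[1.0, 10.0, 100.0],
                  [2.0, 20.0, 1000.0]])
# Log-scale only the third column; the other columns are left untouched.
scaled = apply_to_columns(np.log10, stats, [2])
print(scaled)
```

Log scaling is used here because it is the transformation the class description mentions for statistics that contain outliers.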

get_size()

Returns the current number of points in the dataset

Returns:

int

The current number of points in the dataset

process_outliers(mode='zscore')

Checks for outliers in the calculated summary statistics. Outliers are the few very high or very low values that can bias tasks such as parameter inference. One can remove them, replace them with the mean value, or use a log scale for the statistic in question; this choice is left to the user.

Parameters:

mode : str, optional

Outlier detection method: z-score ('zscore') or interquartile range ('iqr'), by default 'zscore'

Returns:

array

Indices of dataset.s columns containing outliers
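The two detection modes can be illustrated with a minimal sketch that returns the indices of columns containing at least one outlier. The function outlier_columns, the z-score threshold of 3.0, and the 1.5 x IQR fence are assumptions chosen for illustration; the thresholds sciope actually uses are not stated in this reference.

```python
import numpy as np


def outlier_columns(s, mode="zscore", z_thresh=3.0, iqr_factor=1.5):
    """Return indices of columns of `s` that contain at least one outlier.

    Hypothetical helper for illustration; thresholds are assumed values.
    """
    s = np.asarray(s, dtype=float)
    if mode == "zscore":
        std = s.std(axis=0)
        std[std == 0] = 1.0  # constant columns cannot contain outliers
        z = np.abs((s - s.mean(axis=0)) / std)
        flagged = (z > z_thresh).any(axis=0)
    elif mode == "iqr":
        q1, q3 = np.percentile(s, [25, 75], axis=0)
        iqr = q3 - q1
        low = q1 - iqr_factor * iqr
        high = q3 + iqr_factor * iqr
        flagged = ((s < low) | (s > high)).any(axis=0)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return np.flatnonzero(flagged)


# Column 0 is evenly spread; column 1 has one extreme value.
s = np.column_stack((np.arange(20.0), np.arange(20.0)))
s[-1, 1] = 1000.0
print(outlier_columns(s, mode="zscore"))
print(outlier_columns(s, mode="iqr"))
```

Both modes flag only the second column here. The returned indices could then be used to select which columns to transform, for example with a log scale.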

Module contents