split
Ned Taylor edited this page Apr 12, 2024 · 3 revisions
split(
data=None,
list=None,
left_data,
right_data,
left_label,
right_label,
dim,
left_size,
right_size,
shuffle=.false.,
seed=0,
split_list
)
The `split` interface splits a dataset into left and right sets (often used to separate a dataset into training and testing datasets).
`split` is a generic interface to several procedures; which one is invoked depends on the argument types and on whether label data is provided alongside the input data.
- data: An integer or real array of rank n (n = 3 or 5). The input features dataset.
- label: An integer or real array of rank 1. The input dataset labels (expected output). Appears as `list` in the signature above.
- left_data: Output left split of `data`.
- right_data: Output right split of `data`.
- left_label: Output left split of `label`.
- right_label: Output right split of `label`.
- dim: Dimension along which to split data (i.e. the sample index dimension).
- left_size: Fraction of the dataset assigned to the left split. WARNING: provide only one of `left_size` or `right_size`. If both are provided and do not sum to 1, `right_size` is readjusted so that they do.
- right_size: Fraction of the dataset assigned to the right split. WARNING: provide only one of `left_size` or `right_size`. If both are provided and do not sum to 1, `right_size` is readjusted so that they do.
- shuffle: Boolean. Whether to shuffle the dataset. `Default=.false.`.
- seed: An integer scalar. Random number generator seed. `Default=0`.
- split_list: Optional. An output integer list of length equal to the number of samples/records in the dataset. Each element is either 1 or 2, indicating whether the corresponding sample was placed in the `left_` or `right_` storage, respectively.
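
The following is a minimal, self-contained sketch of the behaviour described by the arguments above. It does not call the library and is not its implementation. It only illustrates how `left_size`, `right_size`, `dim` and `split_list` relate to one another. The program name, the rank-2 example array, the nearest-integer rounding of the left sample count, and the choice to send the first samples to the left split when not shuffling are all assumptions made for illustration.

```fortran
program split_sketch
  implicit none

  integer, parameter :: num_samples = 10
  real :: left_size, right_size
  integer :: num_left, i
  integer :: split_list(num_samples)
  real :: data(2, num_samples)            ! samples lie along dimension 2
  real, allocatable :: left_data(:,:), right_data(:,:)

  ! Both fractions are given, but they do not sum to 1 ...
  left_size  = 0.8
  right_size = 0.3
  ! ... so right_size is readjusted, as the documentation describes.
  if (abs(left_size + right_size - 1.0) > 1.e-6) right_size = 1.0 - left_size

  ! Fill the dataset so that each sample is recognisable by its index.
  do i = 1, num_samples
     data(:, i) = real(i)
  end do

  ! ASSUMPTION: nearest-integer rounding; no shuffle, first samples go left.
  num_left = nint(left_size * num_samples)
  split_list(:num_left)   = 1             ! 1 = placed in the left split
  split_list(num_left+1:) = 2             ! 2 = placed in the right split

  ! Select samples along dim = 2, using split_list as a mask on the indices.
  left_data  = data(:, pack([(i, i = 1, num_samples)], split_list == 1))
  right_data = data(:, pack([(i, i = 1, num_samples)], split_list == 2))

  print *, 'right_size after readjustment:', right_size
  print *, 'split_list:', split_list
  print *, 'left samples: ', size(left_data, 2)
  print *, 'right samples:', size(right_data, 2)
end program split_sketch
```

With these inputs, 8 samples end up in the left split and 2 in the right. With `shuffle` enabled, the 1 and 2 entries of `split_list` would presumably be interleaved rather than contiguous, which is where the optional `split_list` output becomes useful for tracing samples back to the original dataset.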