Datasets

This module provides the tools to extract sports betting data.

BaseDataLoader(param_grid=None)

The base class for dataloaders.

Warning: This class should not be used directly. Use the derived classes instead.

Source code in src/sportsbet/datasets/_base.py
def __init__(self: Self, param_grid: ParamGrid | None = None) -> None:
    self.param_grid = param_grid

extract_fixtures_data()

Extract the fixtures data.

Read more in the user guide.

It returns fixtures data that can be used to make predictions for upcoming matches based on a betting strategy.

Before calling the extract_fixtures_data method for the first time, the extract_train_data method should be called, in order to match the columns of the input, output and odds data. Otherwise an AttributeError is raised.

The data contain information about the matches known before the start of the match, i.e. the input data X and the odds data O. The multi-output targets Y are always equal to None and are only included for consistency with the method extract_train_data.

The param_grid parameter of the initialization method has no effect on the fixtures data.

Returns:

Type Description
(X, None, O)

The components represent the fixtures input data X, the multi-output targets Y (always None) and the corresponding odds O, respectively.
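The ordering requirement can be illustrated with a small stand-in class. This is only a sketch: `ToyLoader`, its column names and its hard-coded rows are hypothetical and not part of sportsbet. It mirrors only the check on the `train_data_` attribute that guards the extraction of fixtures.

```python
class ToyLoader:
    """Hypothetical stand-in that mimics the call-order check only."""

    def extract_train_data(self):
        # Extracting training data fixes the column layout and sets `train_data_`.
        self.input_cols_ = ["home_team", "away_team"]  # illustrative names
        self.train_data_ = ([["A", "B"]], [[1]], None)
        return self.train_data_

    def extract_fixtures_data(self):
        # Fixtures reuse the columns chosen during training extraction,
        # so the method refuses to run before `train_data_` exists.
        if not hasattr(self, "train_data_"):
            raise AttributeError(
                "Extract the training data before extracting the fixtures data."
            )
        return ([["C", "D"]], None, None)


loader = ToyLoader()
try:
    loader.extract_fixtures_data()
    raised = False
except AttributeError:
    raised = True

# After the training data are extracted, the fixtures call succeeds.
loader.extract_train_data()
X_fix, Y_fix, O_fix = loader.extract_fixtures_data()
```

In this sketch `raised` ends up `True` for the first call, and the second call returns a tuple whose middle element is `None`.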

Source code in src/sportsbet/datasets/_base.py
def extract_fixtures_data(self: Self) -> FixturesData:
    """Extract the fixtures data.

    Read more in the [user guide][dataloader].

    It returns fixtures data that can be used to make predictions for
    upcoming matches based on a betting strategy.

    Before calling the `extract_fixtures_data` method for
    the first time, the `extract_train_data` method should be called, in
    order to match the columns of the input, output and odds data.

    The data contain information about the matches known before the
    start of the match, i.e. the input data `X` and the odds
    data `O`. The multi-output targets `Y` are always equal to `None`
    and are only included for consistency with the method `extract_train_data`.

    The `param_grid` parameter of the initialization method has no effect
    on the fixtures data.

    Returns:
        (X, None, O):
            Each of the components represent the fixtures input data `X`, the
            multi-output targets `Y` equal to `None` and the
            corresponding odds `O`, respectively.
    """
    # Check that the training data have been extracted
    if not hasattr(self, 'train_data_'):
        error_msg = 'Extract the training data before extracting the fixtures data.'
        raise AttributeError(error_msg)

    data = self._validate_data()

    # Extract fixtures data
    data = data[data['fixtures']].drop(columns=['fixtures'])

    # Convert data types
    data = self._convert_data_types(data)

    # Remove past data
    data = data.loc[data.index >= pd.Timestamp(pd.to_datetime('today').date())]

    # Extract odds
    O_fix = data[self.odds_cols_].reset_index(drop=True) if self.odds_type_ is not None else None

    self.fixtures_data_ = data[self.input_cols_], None, O_fix

    return self.fixtures_data_

extract_train_data(drop_na_thres=0.0, odds_type=None)

Extract the training data.

Read more in the user guide.

It returns historical data that can be used to create a betting strategy based on heuristics or machine learning models.

The data contain information about the matches that falls into two categories. The first category includes any information known before the start of the match, i.e. the training data X and the odds data O. The second category includes the outcomes of matches, i.e. the multi-output targets Y.

The method selects only the data allowed by the param_grid parameter of the initialization method. Additionally, columns with missing values are dropped according to the drop_na_thres parameter, while the type of odds returned is defined by the odds_type parameter.

Parameters:

Name Type Description Default
drop_na_thres float

The threshold that specifies the input columns to drop. It is a float in the [0.0, 1.0] range. Higher values result in dropping more columns. The default value drop_na_thres=0.0 keeps all columns while the maximum value drop_na_thres=1.0 keeps only columns with no missing values.

0.0
odds_type str | None

The selected odds type. It should be one of the available odds column prefixes returned by the method get_odds_types. If odds_type=None then no odds are returned.

None

Returns:

Type Description
(X, Y, O)

The components represent the training input data X, the multi-output targets Y and the corresponding odds O, respectively.
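To make the effect of drop_na_thres concrete, the sketch below applies one plausible reading of the threshold in plain Python: a column is kept when its share of non-missing values is at least drop_na_thres. The helper `keep_columns`, the column names and the exact comparison are assumptions for illustration. The library performs this step in an internal check whose details (for instance rounding) may differ.

```python
def keep_columns(table, drop_na_thres=0.0):
    """Hypothetical helper, not part of sportsbet.

    `table` maps column names to lists of values, with None marking a
    missing value. A column is kept when its share of non-missing values
    is at least `drop_na_thres`.
    """
    if not 0.0 <= drop_na_thres <= 1.0:
        raise ValueError("drop_na_thres must lie in the [0.0, 1.0] range.")
    kept = []
    for name, values in table.items():
        non_missing_share = sum(v is not None for v in values) / len(values)
        if non_missing_share >= drop_na_thres:
            kept.append(name)
    return kept


table = {
    "full": [1, 2, 3, 4],             # no missing values
    "half": [1, None, 3, None],       # 50% missing
    "sparse": [None, None, None, 4],  # 75% missing
}

all_cols = keep_columns(table, 0.0)     # threshold 0.0 keeps every column
mid_cols = keep_columns(table, 0.5)     # drops the sparsest column
strict_cols = keep_columns(table, 1.0)  # keeps only complete columns
```

Under this reading, raising the threshold from 0.0 to 1.0 shrinks the kept set from all three columns to the single complete one, which matches the documented behavior at both ends of the range.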

Source code in src/sportsbet/datasets/_base.py
def extract_train_data(
    self: Self,
    drop_na_thres: float = 0.0,
    odds_type: str | None = None,
) -> TrainData:
    """Extract the training data.

    Read more in the [user guide][dataloader].

    It returns historical data that can be used to create a betting
    strategy based on heuristics or machine learning models.

    The data contain information about the matches that belong
    in two categories. The first category includes any information
    known before the start of the match, i.e. the training data `X`
    and the odds data `O`. The second category includes the outcomes of
    matches i.e. the multi-output targets `Y`.

    The method selects only the data allowed by the `param_grid`
    parameter of the initialization method. Additionally, columns with missing
    values are dropped according to the `drop_na_thres` parameter, while the
    type of odds returned is defined by the `odds_type` parameter.

    Args:
        drop_na_thres:
            The threshold that specifies the input columns to drop. It is a float in
            the `[0.0, 1.0]` range. Higher values result in dropping more columns.
            The default value `drop_na_thres=0.0` keeps all columns while the
            maximum value `drop_na_thres=1.0` keeps only columns with no
            missing values.

        odds_type:
            The selected odds type. It should be one of the available odds column
            prefixes returned by the method `get_odds_types`. If `odds_type=None`
            then no odds are returned.

    Returns:
        (X, Y, O):
            Each of the components represent the training input data `X`, the
            multi-output targets `Y` and the corresponding odds `O`, respectively.
    """

    # Check param grid
    self._check_param_grid()

    # Validate the data
    data = self._validate_data()

    # Extract train data
    data = self._extract_train_data(data)

    # Check dropped columns
    self.drop_na_thres_ = check_scalar(drop_na_thres, 'drop_na_thres', float, min_val=0.0, max_val=1.0)
    self._check_dropped_na_cols(data, drop_na_thres)

    # Check odds type
    dropped_all_na_cols = data.columns.difference(data.dropna(axis=1, how='all').columns)
    odds_types = sorted({col.split('__')[1] for col in self._cols(data, 'odds') if col not in dropped_all_na_cols})
    if odds_type is not None and odds_type not in odds_types:
        error_msg = (
            f'Parameter `odds_type` should be a prefix of available odds columns. Got `{odds_type}` instead.'
        )
        if isinstance(odds_type, str):
            raise ValueError(error_msg)
        else:
            raise TypeError(error_msg)
    self.odds_type_ = odds_type

    # Extract input, odds and output columns
    output_keys = [col.split('__')[1:] for col, _ in self.OUTPUTS]
    target_keys = [col.split('__')[2:] for col in self._cols(data, 'target')]
    odds_keys = [col.split('__')[2:] for col in self._cols(data, 'odds') if col.split('__')[1] == self.odds_type_]
    output_keys = [
        key
        for key in (odds_keys if self.odds_type_ is not None else output_keys)
        if key in output_keys and key[-1:] in target_keys
    ]
    target_output_keys = list({key for *_, key in output_keys})
    self.input_cols_ = pd.Index(
        [col for col in self._cols(data, 'input') if col not in self.dropped_na_cols_],
        dtype=object,
    )
    self.odds_cols_ = pd.Index(
        [f'odds__{self.odds_type_}__{key1}__{key2}' for key1, key2 in output_keys if self.odds_type_ is not None],
        dtype=object,
    )
    self.output_cols_ = pd.Index(
        [f'output__{key1}__{key2}' for key1, key2 in output_keys if key2 in target_output_keys],
        dtype=object,
    )
    self.target_cols_ = pd.Index(
        [col for col in self._cols(data, 'target') if col.split('__')[-1] in target_output_keys],
        dtype=object,
    )

    # Remove missing target data
    data = data.dropna(subset=self.target_cols_, how='any')

    # Convert data types
    data = self._convert_data_types(data)

    # Extract outputs
    Y_train = []
    outputs_mapping = dict(self.OUTPUTS)
    for col in self.output_cols_:
        func = outputs_mapping[col]
        Y_train.append(pd.Series(func(data[self.target_cols_]), name=col))
    Y_train = pd.concat(Y_train, axis=1).reset_index(drop=True)

    # Extract odds
    O_train = data[self.odds_cols_].reset_index(drop=True) if self.odds_type_ is not None else None

    self.train_data_ = data[self.input_cols_], Y_train, O_train
    if hasattr(self, 'fixtures_data_'):
        delattr(self, 'fixtures_data_')

    return self.train_data_

get_all_params() classmethod

Get the available parameters.

It can be used to get the allowed names and values for the param_grid parameter of the dataloader object.

Returns:

Name Type Description
param_grid list[Param]

A list of all allowed params and values.
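The returned list is sorted so that parameter combinations with different sets of keys remain comparable. A minimal sketch of that ordering follows, under stated assumptions: `SCHEMA` and `full_param_grid` are hypothetical sample data, and `str` is used here as the marker for text parameters.

```python
# Hypothetical schema: parameter name -> type of its values.
SCHEMA = {"division": int, "league": str, "year": int}

# Hypothetical parameter combinations; not every combination defines every key.
full_param_grid = [
    {"league": "Spain", "division": 1, "year": 2020},
    {"league": "England", "division": 2, "year": 2019},
    {"division": 1, "year": 1998},
    {"league": "England", "division": 1, "year": 2020},
]

# Collect every parameter name in a fixed (alphabetical) order.
params_names = sorted({name for params in full_param_grid for name in params})


def sort_key(params):
    # Missing text parameters sort as '' and missing numeric ones as 0,
    # so the key tuples stay comparable element by element.
    return tuple(
        params.get(name, "" if SCHEMA[name] is str else 0) for name in params_names
    )


all_params = sorted(full_param_grid, key=sort_key)
```

With these sample values the combination that lacks a league sorts first within division 1, because its missing league is treated as an empty string.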

Source code in src/sportsbet/datasets/_base.py
@classmethod
def get_all_params(cls: type[BaseDataLoader]) -> list[Param]:
    """Get the available parameters.

    It can be used to get the allowed names and values for the
    `param_grid` parameter of the dataloader object.

    Returns:
        param_grid: list
            A list of all allowed params and values.
    """
    full_param_grid = cls._get_full_param_grid()
    params_names = sorted({param_name for params in full_param_grid for param_name in params})
    all_params = sorted(
        full_param_grid,
        key=lambda params: tuple(
            params.get(name, '' if dict(cls.SCHEMA)[name] is object else 0) for name in params_names
        ),
    )
    return all_params

get_odds_types()

Get the available odds types.

It can be used to get the allowed odds types of the dataloader's method extract_train_data.

Returns:

Name Type Description
odds_types list[str]

A list of available odds types.
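Odds types are read off the column names, which follow a double-underscore pattern such as `odds__market_average__home_win__full_time_goals`. The sketch below shows the idea on a plain list of names. Only that first name comes from this documentation; the other names and the `startswith` filter are assumptions for illustration. The library additionally ignores odds columns that contain only missing values.

```python
# Illustrative column names; only the first one appears in the documentation,
# the rest are hypothetical.
columns = [
    "odds__market_average__home_win__full_time_goals",
    "odds__market_average__draw__full_time_goals",
    "odds__market_maximum__home_win__full_time_goals",
    "input__home_team",
]

# The odds type is the second `__`-separated field of each odds column.
odds_types = sorted(
    {col.split("__")[1] for col in columns if col.startswith("odds__")}
)
```

Each value obtained this way is a candidate for the odds_type argument of extract_train_data.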

Source code in src/sportsbet/datasets/_base.py
def get_odds_types(self: Self) -> list[str]:
    """Get the available odds types.

    It can be used to get the allowed odds types of the dataloader's method
    `extract_train_data`.

    Returns:
        odds_types:
            A list of available odds types.
    """
    # Check param grid
    self._check_param_grid()

    # Validate the data
    data = self._validate_data()

    # Extract train data
    data = self._extract_train_data(data)

    # Drop columns with only missing values
    dropped_all_na_cols = data.columns.difference(data.dropna(axis=1, how='all').columns)

    return sorted({col.split('__')[1] for col in self._cols(data, 'odds') if col not in dropped_all_na_cols})

save(path)

Save the dataloader object.

Parameters:

Name Type Description Default
path str

The path to save the object.

required

Returns:

Name Type Description
self Self

The dataloader object.

Source code in src/sportsbet/datasets/_base.py
def save(self: Self, path: str) -> Self:
    """Save the dataloader object.

    Args:
        path:
            The path to save the object.

    Returns:
        self:
            The dataloader object.
    """
    with Path(path).open('wb') as file:
        cloudpickle.dump(self, file)
    return self

DummySoccerDataLoader(param_grid=None)

Bases: BaseDataLoader

Dataloader for soccer dummy data.

The data are provided only for convenience, since they require no downloading, and to familiarize the user with the methods of the dataloader objects.

Read more in the user guide.

Parameters:

Name Type Description Default
param_grid ParamGrid | None

It selects the type of information that the data include. The keys of the dictionaries might be parameters like 'league' or 'division' while the values are sequences of allowed values. It works in a similar way to the param_grid parameter of scikit-learn's ParameterGrid class. The default value None corresponds to all parameters.

None
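As a rough illustration of how a parameter grid of this kind expands into single combinations, the sketch below uses `itertools.product` as a dependency-free stand-in for scikit-learn's ParameterGrid. The helper `expand_grid` is hypothetical and is not the library's implementation.

```python
from itertools import product


def expand_grid(param_grid):
    """Hypothetical helper: expand a dict (or list of dicts) of allowed
    values into a list of single-value combinations."""
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    combinations = []
    for grid in grids:
        names = sorted(grid)  # fixed key order for reproducible output
        for values in product(*(grid[name] for name in names)):
            combinations.append(dict(zip(names, values)))
    return combinations


# Two leagues and one year expand into two combinations.
combos = expand_grid({"league": ["England", "Spain"], "year": [2020]})

# A list of dictionaries is expanded grid by grid.
n_mixed = len(expand_grid([{"league": ["Spain"]}, {"division": [1, 2]}]))
```

The dataloader then keeps only the data that match one of the resulting combinations.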

Attributes:

Name Type Description
param_grid_ ParameterGrid

The checked value of parameters grid. It includes all possible parameters if param_grid is None.

dropped_na_cols_ Index

The columns with missing values that are dropped.

drop_na_thres_ float

The checked value of drop_na_thres.

odds_type_ str | None

The checked value of odds_type.

input_cols_ Index

The columns of X_train and X_fix.

output_cols_ Index

The columns of Y_train and Y_fix.

odds_cols_ Index

The columns of O_train and O_fix.

target_cols_ Index

The columns used for the extraction of output and odds columns.

train_data_ TrainData

The tuple (X, Y, O) that represents the training data as extracted from the method extract_train_data.

fixtures_data_ FixturesData

The tuple (X, None, O) that represents the fixtures data as extracted from the method extract_fixtures_data.

Examples:

>>> from sportsbet.datasets import DummySoccerDataLoader
>>> import pandas as pd
>>> # Get all available parameters to select the training data
>>> DummySoccerDataLoader.get_all_params()
[{'division': 1, 'year': 1998}, ...
>>> # Select only the training data for the Spanish league
>>> dataloader = DummySoccerDataLoader(param_grid={'league': ['Spain']})
>>> # Get available odds types
>>> dataloader.get_odds_types()
['interwetten', 'williamhill']
>>> # Select the odds of the Interwetten bookmaker for training data
>>> X_train, Y_train, O_train = dataloader.extract_train_data(
... odds_type='interwetten')
>>> # Extract the corresponding fixtures data
>>> X_fix, Y_fix, O_fix = dataloader.extract_fixtures_data()
>>> # Training and fixtures input and odds data have the same column names
>>> pd.testing.assert_index_equal(X_train.columns, X_fix.columns)
>>> pd.testing.assert_index_equal(O_train.columns, O_fix.columns)
>>> # Fixtures data never include an output
>>> Y_fix is None
True
Source code in src/sportsbet/datasets/_dummy.py
def __init__(self: Self, param_grid: ParamGrid | None = None) -> None:
    super().__init__(param_grid)

extract_fixtures_data()

Extract the fixtures data.

Read more in the user guide.

It returns fixtures data that can be used to make predictions for upcoming matches based on a betting strategy.

Before calling the extract_fixtures_data method for the first time, the extract_train_data method should be called, in order to match the columns of the input, output and odds data. Otherwise an AttributeError is raised.

The data contain information about the matches known before the start of the match, i.e. the input data X and the odds data O. The multi-output targets Y are always equal to None and are only included for consistency with the method extract_train_data.

The param_grid parameter of the initialization method has no effect on the fixtures data.

Returns:

Type Description
(X, None, O)

The components represent the fixtures input data X, the multi-output targets Y (always None) and the corresponding odds O, respectively.

Source code in src/sportsbet/datasets/_dummy.py
def extract_fixtures_data(self: Self) -> FixturesData:
    """Extract the fixtures data.

    Read more in the [user guide][dataloader].

    It returns fixtures data that can be used to make predictions for
    upcoming matches based on a betting strategy.

    Before calling the `extract_fixtures_data` method for
    the first time, the `extract_train_data` method should be called, in
    order to match the columns of the input, output and odds data.

    The data contain information about the matches known before the
    start of the match, i.e. the input data `X` and the odds
    data `O`. The multi-output targets `Y` are always equal to `None`
    and are only included for consistency with the method `extract_train_data`.

    The `param_grid` parameter of the initialization method has no effect
    on the fixtures data.

    Returns:
        (X, None, O):
            Each of the components represent the fixtures input data `X`, the
            multi-output targets `Y` equal to `None` and the
            corresponding odds `O`, respectively.
    """
    return super().extract_fixtures_data()

extract_train_data(drop_na_thres=0.0, odds_type=None)

Extract the training data.

Read more in the user guide.

It returns historical data that can be used to create a betting strategy based on heuristics or machine learning models.

The data contain information about the matches that falls into two categories. The first category includes any information known before the start of the match, i.e. the training data X and the odds data O. The second category includes the outcomes of matches, i.e. the multi-output targets Y.

The method selects only the data allowed by the param_grid parameter of the initialization method. Additionally, columns with missing values are dropped according to the drop_na_thres parameter, while the type of odds returned is defined by the odds_type parameter.

Parameters:

Name Type Description Default
drop_na_thres float

The threshold that specifies the input columns to drop. It is a float in the [0.0, 1.0] range. Higher values result in dropping more columns. The default value drop_na_thres=0.0 keeps all columns while the maximum value drop_na_thres=1.0 keeps only columns with no missing values.

0.0
odds_type str | None

The selected odds type. It should be one of the available odds column prefixes returned by the method get_odds_types. If odds_type=None then no odds are returned.

None

Returns:

Type Description
(X, Y, O)

The components represent the training input data X, the multi-output targets Y and the corresponding odds O, respectively.

Source code in src/sportsbet/datasets/_dummy.py
def extract_train_data(
    self: Self,
    drop_na_thres: float = 0.0,
    odds_type: str | None = None,
) -> TrainData:
    """Extract the training data.

    Read more in the [user guide][dataloader].

    It returns historical data that can be used to create a betting
    strategy based on heuristics or machine learning models.

    The data contain information about the matches that belong
    in two categories. The first category includes any information
    known before the start of the match, i.e. the training data `X`
    and the odds data `O`. The second category includes the outcomes of
    matches i.e. the multi-output targets `Y`.

    The method selects only the data allowed by the `param_grid`
    parameter of the initialization method. Additionally, columns with missing
    values are dropped according to the `drop_na_thres` parameter, while the
    type of odds returned is defined by the `odds_type` parameter.

    Args:
        drop_na_thres:
            The threshold that specifies the input columns to drop. It is a float in
            the `[0.0, 1.0]` range. Higher values result in dropping more columns.
            The default value `drop_na_thres=0.0` keeps all columns while the
            maximum value `drop_na_thres=1.0` keeps only columns with no
            missing values.

        odds_type:
            The selected odds type. It should be one of the available odds column
            prefixes returned by the method `get_odds_types`. If `odds_type=None`
            then no odds are returned.

    Returns:
        (X, Y, O):
            Each of the components represent the training input data `X`, the
            multi-output targets `Y` and the corresponding odds `O`, respectively.
    """
    return super().extract_train_data(drop_na_thres, odds_type)

SoccerDataLoader(param_grid=None)

Bases: BaseDataLoader

Dataloader for soccer data.

It downloads historical and fixtures data for various leagues, years and divisions.

Read more in the user guide.

Parameters:

Name Type Description Default
param_grid ParamGrid | None

It selects the type of information that the data include. The keys of the dictionaries might be parameters like 'league' or 'division' while the values are sequences of allowed values. It works in a similar way to the param_grid parameter of scikit-learn's ParameterGrid class. The default value None corresponds to all parameters.

None

Attributes:

Name Type Description
param_grid_ ParameterGrid

The checked value of parameters grid. It includes all possible parameters if param_grid is None.

dropped_na_cols_ Index

The columns with missing values that are dropped.

drop_na_thres_ float

The checked value of drop_na_thres.

odds_type_ str | None

The checked value of odds_type.

input_cols_ Index

The columns of X_train and X_fix.

output_cols_ Index

The columns of Y_train and Y_fix.

odds_cols_ Index

The columns of O_train and O_fix.

target_cols_ Index

The columns used for the extraction of output and odds columns.

train_data_ TrainData

The tuple (X, Y, O) that represents the training data as extracted from the method extract_train_data.

fixtures_data_ FixturesData

The tuple (X, None, O) that represents the fixtures data as extracted from the method extract_fixtures_data.

Examples:

>>> from sportsbet.datasets import SoccerDataLoader
>>> import pandas as pd
>>> # Get all available parameters to select the training data
>>> SoccerDataLoader.get_all_params()
[{'division': 1, 'league': 'Argentina', ...
>>> # Select only the training data for the English and Spanish leagues of the year 2020
>>> dataloader = SoccerDataLoader(
... param_grid={'league': ['England', 'Spain'], 'year': [2020]})
>>> # Get available odds types
>>> dataloader.get_odds_types()
['market_average', 'market_maximum']
>>> # Select the market average odds; the default drop_na_thres=0.0 keeps all columns
>>> X_train, Y_train, O_train = dataloader.extract_train_data(
... odds_type='market_average')
>>> # Odds data include the selected market average odds
>>> O_train.columns
Index(['odds__market_average__home_win__full_time_goals',...
>>> # Extract the corresponding fixtures data
>>> X_fix, Y_fix, O_fix = dataloader.extract_fixtures_data()
>>> # Training and fixtures input and odds data have the same column names
>>> pd.testing.assert_index_equal(X_train.columns, X_fix.columns)
>>> pd.testing.assert_index_equal(O_train.columns, O_fix.columns)
>>> # Fixtures data never include an output
>>> Y_fix is None
True
Source code in src/sportsbet/datasets/_soccer/_data.py
def __init__(self: Self, param_grid: ParamGrid | None = None) -> None:
    super().__init__(param_grid)

extract_fixtures_data()

Extract the fixtures data.

Read more in the user guide.

It returns fixtures data that can be used to make predictions for upcoming matches based on a betting strategy.

Before calling the extract_fixtures_data method for the first time, the extract_train_data method should be called, in order to match the columns of the input, output and odds data. Otherwise an AttributeError is raised.

The data contain information about the matches known before the start of the match, i.e. the input data X and the odds data O. The multi-output targets Y are always equal to None and are only included for consistency with the method extract_train_data.

The param_grid parameter of the initialization method has no effect on the fixtures data.

Returns:

Type Description
(X, None, O)

The components represent the fixtures input data X, the multi-output targets Y (always None) and the corresponding odds O, respectively.

Source code in src/sportsbet/datasets/_soccer/_data.py
def extract_fixtures_data(self: Self) -> FixturesData:
    """Extract the fixtures data.

    Read more in the [user guide][dataloader].

    It returns fixtures data that can be used to make predictions for
    upcoming matches based on a betting strategy.

    Before calling the `extract_fixtures_data` method for
    the first time, the `extract_train_data` method should be called, in
    order to match the columns of the input, output and odds data.

    The data contain information about the matches known before the
    start of the match, i.e. the input data `X` and the odds
    data `O`. The multi-output targets `Y` are always equal to `None`
    and are only included for consistency with the method `extract_train_data`.

    The `param_grid` parameter of the initialization method has no effect
    on the fixtures data.

    Returns:
        (X, None, O):
            Each of the components represent the fixtures input data `X`, the
            multi-output targets `Y` equal to `None` and the
            corresponding odds `O`, respectively.
    """
    return super().extract_fixtures_data()

extract_train_data(drop_na_thres=0.0, odds_type=None)

Extract the training data.

Read more in the user guide.

It returns historical data that can be used to create a betting strategy based on heuristics or machine learning models.

The data contain information about the matches that falls into two categories. The first category includes any information known before the start of the match, i.e. the training data X and the odds data O. The second category includes the outcomes of matches, i.e. the multi-output targets Y.

The method selects only the data allowed by the param_grid parameter of the initialization method. Additionally, columns with missing values are dropped according to the drop_na_thres parameter, while the type of odds returned is defined by the odds_type parameter.

Parameters:

Name Type Description Default
drop_na_thres float

The threshold that specifies the input columns to drop. It is a float in the [0.0, 1.0] range. Higher values result in dropping more columns. The default value drop_na_thres=0.0 keeps all columns while the maximum value drop_na_thres=1.0 keeps only columns with no missing values.

0.0
odds_type str | None

The selected odds type. It should be one of the available odds column prefixes returned by the method get_odds_types. If odds_type=None then no odds are returned.

None

Returns:

Type Description
(X, Y, O)

The components represent the training input data X, the multi-output targets Y and the corresponding odds O, respectively.

Source code in src/sportsbet/datasets/_soccer/_data.py
def extract_train_data(
    self: Self,
    drop_na_thres: float = 0.0,
    odds_type: str | None = None,
) -> TrainData:
    """Extract the training data.

    Read more in the [user guide][dataloader].

    It returns historical data that can be used to create a betting
    strategy based on heuristics or machine learning models.

    The data contain information about the matches that belong
    in two categories. The first category includes any information
    known before the start of the match, i.e. the training data `X`
    and the odds data `O`. The second category includes the outcomes of
    matches i.e. the multi-output targets `Y`.

    The method selects only the data allowed by the `param_grid`
    parameter of the initialization method. Additionally, columns with missing
    values are dropped according to the `drop_na_thres` parameter, while the
    type of odds returned is defined by the `odds_type` parameter.

    Args:
        drop_na_thres:
            The threshold that specifies the input columns to drop. It is a float in
            the `[0.0, 1.0]` range. Higher values result in dropping more columns.
            The default value `drop_na_thres=0.0` keeps all columns while the
            maximum value `drop_na_thres=1.0` keeps only columns with no
            missing values.

        odds_type:
            The selected odds type. It should be one of the available odds column
            prefixes returned by the method `get_odds_types`. If `odds_type=None`
            then no odds are returned.

    Returns:
        (X, Y, O):
            Each of the components represent the training input data `X`, the
            multi-output targets `Y` and the corresponding odds `O`, respectively.
    """
    return super().extract_train_data(drop_na_thres=drop_na_thres, odds_type=odds_type)

load_dataloader(path)

Load the dataloader object.

Parameters:

Name Type Description Default
path str

The path of the dataloader pickled file.

required

Returns:

Name Type Description
dataloader BaseDataLoader

The dataloader object.
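Saving and loading form a round trip through a pickled file. The sketch below reproduces that round trip with the standard-library `pickle` module swapped in for the `cloudpickle` package that the library uses, and with a plain dictionary standing in for a dataloader. The helpers `save_object` and `load_object` and the dictionary contents are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_object(obj, path):
    # Write the pickled object and return it, so that calls can be chained
    # in the same way as the dataloader's `save` method returns `self`.
    with Path(path).open("wb") as file:
        pickle.dump(obj, file)
    return obj


def load_object(path):
    # Read the pickled object back from disk.
    with Path(path).open("rb") as file:
        return pickle.load(file)


# Hypothetical state standing in for a fitted dataloader.
state = {"param_grid": {"league": ["Spain"]}, "odds_type_": "interwetten"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "dataloader.pkl"
    returned = save_object(state, path)
    restored = load_object(path)
```

The restored object is an equal but distinct copy of the saved one. As with any pickle-based format, a file should only be loaded when it comes from a trusted source.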

Source code in src/sportsbet/datasets/_base.py
def load_dataloader(path: str) -> BaseDataLoader:
    """Load the dataloader object.

    Args:
        path:
            The path of the dataloader pickled file.

    Returns:
        dataloader:
            The dataloader object.
    """
    with Path(path).open('rb') as file:
        dataloader = cloudpickle.load(file)
    return dataloader