Dataset

class tailestim.datasets.TailData(name=None, path=None)

Bases: object

Load and manage tail distribution datasets.

This class provides functionality to load datasets either from the package’s built-in data directory using a name, or from a custom path provided by the user.

Parameters:
name : str, optional

Name of a built-in dataset to load (without file extension). Must be provided if path is None.

path : str, optional

Path to a custom dataset file. If provided, this takes precedence over name. Must be provided if name is None.
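
The precedence rule described above (path wins over name, and at least one of the two must be given) can be sketched as follows. This is only an illustration, not the package's actual code: the helper name resolve_source, the data_dir and ext arguments, the ".dat" extension for built-in files, and the choice of ValueError are all assumptions made for the example.

```python
from pathlib import Path


def resolve_source(name=None, path=None, data_dir="builtin_data", ext=".dat"):
    """Hypothetical helper: decide which file a (name, path) pair refers to.

    data_dir and ext are placeholders; the real built-in data directory
    and file extension are not stated in this documentation.
    """
    if path is not None:
        # A custom path takes precedence, even if a name is also given.
        return Path(path)
    if name is not None:
        # Built-in datasets are addressed by name, without a file extension.
        return Path(data_dir) / f"{name}{ext}"
    # Neither argument was supplied (the exception type is an assumption).
    raise ValueError("Either name or path must be provided.")
```

Under these assumptions, resolve_source(name='CAIDA_KONECT', path='path/to/my/data.dat') returns the custom path and ignores the name.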

Attributes:
name : str or None

Name of the dataset if a built-in dataset was loaded.

path : str or None

Path to the dataset file if a custom dataset was loaded.

data : numpy.ndarray

The loaded dataset as a numpy array.

Examples

Load a built-in dataset:

>>> data = TailData(name='CAIDA_KONECT')
>>> print(len(data.data))

Load a custom dataset:

>>> data = TailData(path='path/to/my/data.dat')
>>> print(len(data.data))

__repr__()

Return a string representation of the TailData object.

Returns:
str

String representation including the data source and length.

load_data()

Load data from either a built-in dataset or a custom file path.

Returns:
numpy.ndarray

The loaded dataset as a numpy array.

Raises:
FileNotFoundError

If the specified dataset file cannot be found.
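
Because a missing file raises FileNotFoundError, callers may want to catch it rather than let it propagate. The sketch below shows one way to do that with a small wrapper. The helper name try_load is hypothetical and not part of tailestim; it accepts any zero-argument callable so that the example runs without the package installed.

```python
def try_load(loader):
    """Hypothetical wrapper: call loader() and return None if the file is missing."""
    try:
        return loader()
    except FileNotFoundError as exc:
        # Report the problem and let the caller decide how to continue.
        print(f"Dataset not found: {exc}")
        return None
```

With tailestim installed, a call might look like try_load(lambda: TailData(path='path/to/my/data.dat')), assuming the error surfaces either when the object is constructed or when load_data() is called.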