commonpower.data_forecasting.data_sources.PandasDataSource
- class PandasDataSource(data: DataFrame, frequency: timedelta = datetime.timedelta(seconds=3600))
Bases: DataSource
Data source based on a pandas dataframe.
- Parameters:
data (pd.DataFrame) – Dataframe containing the data. The index needs to be a datetime index.
frequency (timedelta, optional) – Frequency of the data. Defaults to timedelta(hours=1).
Methods
apply_to_column – Applies a transformation to a column of the data (using pandas df.apply()).
create_time_features – Creates time features from the datetime index.
get_date_range – Returns the date range for which data is available.
get_limits – Returns the limits for each variable in the data source.
get_variables – Returns the list of element names for which data is available.
shift_time_series – Shifts time series by a given timedelta.
- __call__(from_time: datetime, to_time: datetime) → ndarray
Return the data in this date range.
- Parameters:
from_time (datetime) – Start time of observation.
to_time (datetime) – End time of observation.
- Returns:
np.ndarray – Data of shape (n_horizon, n_vars).
- apply_to_column(column: str, fcn: callable) → PandasDataSource
Applies a transformation to a column of the data (using pandas df.apply()).
- Parameters:
column (str) – Column to manipulate.
fcn (callable) – Transformation to apply. The function must take a single argument, the value of one cell: fcn(x).
- Returns:
DataSource – self
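The fcn(x) contract means the callable receives one cell value at a time, not a whole column. The following is a minimal plain-Python sketch of that contract only. The helper apply_cellwise, the function watts_to_kilowatts, and the sample values are hypothetical and are not part of the library:

```python
def apply_cellwise(values, fcn):
    """Apply fcn to each cell value independently and return a new list.

    Hypothetical stand-in for a cell-wise column transformation.
    """
    return [fcn(x) for x in values]


def watts_to_kilowatts(x):
    # Receives a single scalar cell value, matching the fcn(x) contract.
    return x / 1000.0


load_w = [1500.0, 0.0, 2500.0]
load_kw = apply_cellwise(load_w, watts_to_kilowatts)
```

A callable written this way (one scalar in, one scalar out) is the kind of function that would be passed as fcn.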
- create_time_features(month: bool = True, day: bool = True, hour: bool = True) → PandasDataSource
Creates time features from the datetime index. The features are encoded cyclically via sin and cos transformations. The created features (if enabled) are: month_sin, month_cos, day_sin, day_cos, hour_sin, hour_cos.
- Parameters:
month (bool, optional) – If True, the month is added as a feature. Defaults to True.
day (bool, optional) – If True, the weekday is added as a feature. Defaults to True.
hour (bool, optional) – If True, the hour is added as a feature. Defaults to True.
- Returns:
PandasDataSource – self
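Cyclic encoding maps a periodic quantity onto the unit circle, so that values at the wrap-around (for example hour 23 and hour 0) end up close together. The sketch below illustrates the idea with the standard library only. The periods (12, 7, 24), the zero-based month, and the helper names cyclic_encode and time_features are assumptions made for illustration; the library's exact scaling may differ:

```python
import math
from datetime import datetime


def cyclic_encode(value, period):
    """Map a periodic value to (sin, cos) coordinates on the unit circle."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def time_features(ts):
    """Build the six feature names from the docs for one timestamp.

    Assumed periods: 12 months, 7 weekdays, 24 hours.
    """
    month_sin, month_cos = cyclic_encode(ts.month - 1, 12)  # zero-based month
    day_sin, day_cos = cyclic_encode(ts.weekday(), 7)       # Monday == 0
    hour_sin, hour_cos = cyclic_encode(ts.hour, 24)
    return {
        "month_sin": month_sin, "month_cos": month_cos,
        "day_sin": day_sin, "day_cos": day_cos,
        "hour_sin": hour_sin, "hour_cos": hour_cos,
    }


features = time_features(datetime(2024, 1, 1, 6))  # Monday, 06:00
```

Using both sin and cos is what makes the encoding unambiguous: either one alone maps two different times of the cycle to the same value.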
- get_date_range() → List[datetime]
Returns the date range for which data is available.
- Returns:
List[datetime] – [start_date, end_date]
- get_limits() → dict[str, tuple[float, float]]
Returns the limits for each variable in the data source.
- Returns:
dict[str, tuple[float, float]] – Mapping of the form {"element1": (lower_bound, upper_bound), "element2": (lower_bound, upper_bound)}.
- get_variables() → List[str]
Returns the list of element names for which data is available.
- Returns:
List[str] – List of available elements.
- shift_time_series(shift_by: timedelta) → DataSource
Shifts time series by a given timedelta. The shift is done in a rolling fashion, so the start and end timestamps do not change: values pushed past one end of the series wrap around to the other. Can be used to simulate more diverse data.
- Parameters:
shift_by (timedelta) – Time delta to shift by. Positive values shift into the "future", negative values into the "past".
- Returns:
DataSource – self
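A rolling shift can be pictured as rotating the values while the timestamps stay fixed. The sketch below uses only the standard library and is not the library's implementation. It assumes that shift_by is a whole multiple of the data frequency and that a positive shift moves the value recorded at time t to time t + shift_by. The helper name roll_values and the sample data are hypothetical:

```python
from datetime import timedelta


def roll_values(values, shift_by, frequency):
    """Rotate values by shift_by / frequency steps, wrapping around the ends.

    Assumes shift_by is a whole multiple of frequency.
    """
    n = len(values)
    # Floor division of two timedeltas yields an int; the modulo keeps
    # the step count in [0, n) for negative and oversized shifts.
    steps = (shift_by // frequency) % n
    return values[n - steps:] + values[:n - steps]


hourly = timedelta(hours=1)
values = [10, 20, 30, 40]  # samples at 00:00, 01:00, 02:00, 03:00

into_future = roll_values(values, timedelta(hours=1), hourly)
into_past = roll_values(values, timedelta(hours=-1), hourly)
```

With these inputs, the value recorded at 00:00 appears at 01:00 after the positive shift, and the last value wraps around to the first timestamp. The series keeps its length and its time index.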