# Chapter with code

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

## Prepare data

```{admonition} See also
:class: seealso
All the commands used here are described in the [pandas documentation](https://pandas.pydata.org/docs/index.html).
```

First, load a dataset [from GitHub](https://github.com/jenfly/opsd):

In [None]:
data = pd.read_csv('https://github.com/jenfly/opsd/raw/master/opsd_germany_daily.csv')

And take a look:

In [None]:
data

In [None]:
data.dtypes

Let's convert the `Date` column to `datetime`:

In [None]:
data['Date'] = pd.to_datetime(data['Date'])

And set it as the index:

In [None]:
data.set_index('Date', inplace=True)

In [None]:
data
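
The conversion and the index assignment above can also be done while reading the file. The following is a minimal sketch, assuming a tiny in-memory excerpt with made-up values in place of the real CSV (`csv_text` and `sample` are hypothetical names); `parse_dates` and `index_col` are regular `pd.read_csv` parameters.

```python
import io

import pandas as pd

# Hypothetical three-row excerpt standing in for the real CSV file
csv_text = """Date,Consumption,Wind,Solar
2017-01-01,1130.4,307.1,35.3
2017-01-02,1441.1,295.1,12.5
2017-01-03,1529.9,666.2,9.4
"""

# Parse the Date column and use it as the index in a single step
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'], index_col='Date')

print(sample.index.dtype)
print(sample)
```

With the real dataset, the URL would take the place of `io.StringIO(csv_text)`.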

What are the units? According to the [preprocessing notebook](https://github.com/jenfly/opsd/blob/master/time-series-preprocessing.ipynb), the values are in GWh.<br>Rename the columns accordingly:

In [None]:
data.rename(axis=1, mapper={col: col+' [GWh]' for col in data.columns}, inplace=True)

In [None]:
data

Looks fine.
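
If every column gets the same suffix, `DataFrame.add_suffix` is a shorter alternative to building a mapper. A small sketch on a hypothetical two-row frame (`sample` and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical frame with two of the dataset's column names
sample = pd.DataFrame({'Consumption': [1.0, 2.0], 'Wind': [0.5, 0.7]})

# add_suffix returns a new frame; the original is left unchanged
labelled = sample.add_suffix(' [GWh]')

print(list(labelled.columns))
```

Unlike `rename(..., inplace=True)`, this does not modify the frame in place, so the result has to be assigned to a name.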

## Examine data

Now let's have a look at a plot:

In [None]:
plt.figure(figsize=(16,7))
for col in data.columns:
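    # Drop the unit suffix for the legend (str.rstrip would strip a set of characters, not a suffix)
    plt.plot(data[col], label=col.replace(' [GWh]', ''))
plt.title('Consumption and generation energy data', fontsize=16)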
plt.xlabel('Date', fontsize=14)
plt.ylabel('Energy [GWh]', fontsize=14)
plt.grid()
plt.legend();
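
A note on shortening legend labels: `str.rstrip` treats its argument as a set of characters to remove from the right, not as a literal suffix. The sketch below compares it with `str.removesuffix`, which assumes Python 3.9 or newer; `label` is just an illustrative variable.

```python
label = 'Wind+Solar [GWh]'

# rstrip removes any trailing '[', 'G', 'W', 'h', ']' characters and stops at the space
stripped = label.rstrip('[GWh]')

# removesuffix removes exactly the given suffix, including the space (Python 3.9+)
cleaned = label.removesuffix(' [GWh]')

print(repr(stripped))
print(repr(cleaned))
```

With `rstrip`, a trailing space is left behind, and a label that happens to end in one of the listed characters would lose more than intended.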

### Total energy consumption

__How much energy was consumed each year?__

In [None]:
data['year'] = data.index.year

In [None]:
# sum per year and divide by 1000 => TWh
consumptions_years = data.groupby('year').sum() / 1000

In [None]:
# rename the columns of the yearly frame accordingly
consumptions_years.rename(columns=lambda col: col.replace('GWh', 'TWh'), inplace=True)

In [None]:
plt.figure(figsize=(16,7))
plt.bar(x=consumptions_years.index, height=consumptions_years[consumptions_years.columns[0]])
plt.title('Consumption energy data',fontsize=16)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Energy [TWh]', fontsize=14)
plt.grid()

Consumption is about the same each year.
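
One way to put a number on how similar the years are is the relative spread of the yearly totals, (max - min) / mean. The sketch below uses a synthetic constant series, not the OPSD data, and groups on the index year directly so that no helper column is needed; all names and values are hypothetical.

```python
import pandas as pd

# Synthetic daily series (made-up constant value, not the OPSD data)
idx = pd.date_range('2015-01-01', '2017-12-31', freq='D')
consumption_gwh = pd.Series(1400.0, index=idx, name='Consumption [GWh]')

# Group on the year of the index and convert GWh to TWh
yearly_twh = consumption_gwh.groupby(consumption_gwh.index.year).sum() / 1000

# Relative spread of the yearly totals: (max - min) / mean
spread = (yearly_twh.max() - yearly_twh.min()) / yearly_twh.mean()

print(yearly_twh)
print(f'relative spread: {spread:.2%}')
```

In this synthetic case the only difference between years comes from the leap day in 2016.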

```{admonition} See also
:class: seealso
- Website of the [Umweltbundesamt](https://www.umweltbundesamt.de/daten/energie/stromverbrauch)
- Website of [SMARD](https://www.smard.de/home/marktdaten?marketDataAttributes=%7B%22resolution%22:%22year%22,%22from%22:1420066800000,%22to%22:1483311599999,%22moduleIds%22:%5B5000410%5D,%22selectedCategory%22:null,%22activeChart%22:true,%22style%22:%22color%22,%22region%22:%22DE%22%7D)
```

__What does the consumption look like over one year?__

In [None]:
plt.figure(figsize=(16,7))
plt.plot(data.loc['2017', 'Consumption [GWh]'], '-x')
plt.title('Consumption energy data',fontsize=16)
plt.xlabel('Date', fontsize=14)
plt.ylabel('Energy [GWh]', fontsize=14)
plt.grid()

The plot shows a weekly pattern (52 weeks a year), with lower demand on weekends.<br>Let's look at the mean consumption per weekday:

In [None]:
data2017daily = pd.DataFrame(data.loc['2017', 'Consumption [GWh]'])
data2017daily['day'] = data2017daily.index.weekday

# mean consumption for each weekday (0 = Monday, ..., 6 = Sunday)
data2017dailyGrouped = data2017daily.groupby('day').mean()

In [None]:
plt.figure(figsize=(16,7))
plt.bar(x=data2017dailyGrouped.index, height=data2017dailyGrouped['Consumption [GWh]'])
plt.title('Consumption energy data',fontsize=16)
plt.xlabel('Weekday', fontsize=14)
plt.ylabel('Energy [GWh]', fontsize=14)
plt.grid()

```{admonition} Hint
:class: hint
The numbers 0 to 6 represent the weekdays Monday to Sunday (see [pandas](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DatetimeIndex.html?highlight=datetimeindex#pandas.DatetimeIndex)).
```
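
To make such a grouped result easier to read, the numeric weekday index can be replaced by names. A minimal sketch on synthetic data (the values, `demo`, and `names` are made up for illustration):

```python
import pandas as pd

# Two synthetic weeks starting on Monday, 2017-01-02 (made-up values)
idx = pd.date_range('2017-01-02', periods=14, freq='D')
values = [1500, 1500, 1500, 1500, 1500, 1200, 1100] * 2
demo = pd.Series(values, index=idx, name='Consumption [GWh]', dtype=float)

# Mean per weekday number (0 = Monday, ..., 6 = Sunday)
by_day = demo.groupby(demo.index.weekday).mean()

# Replace the numeric index by short weekday names
names = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
by_day.index = [names[i] for i in by_day.index]

print(by_day)
```

The relabelled index can be passed to `plt.bar` as `x`, which puts the names on the axis.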

Confirmed: weekends have lower consumption.
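
The size of the weekend effect can be summarized as the ratio of mean weekend to mean workday consumption. The following sketch uses synthetic values rather than the OPSD data; `demo` and `is_weekend` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Four synthetic weeks: 1500 on workdays, 1150 on weekends (made-up values)
idx = pd.date_range('2017-01-02', periods=28, freq='D')
demo = pd.Series(np.where(idx.weekday >= 5, 1150.0, 1500.0), index=idx)

# Weekday numbers 5 and 6 are Saturday and Sunday
is_weekend = demo.index.weekday >= 5
weekend_mean = demo[is_weekend].mean()
workday_mean = demo[~is_weekend].mean()

print(f'weekend / workday: {weekend_mean / workday_mean:.2f}')
```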

### Total energy generation by renewables

__How much energy was produced each year?__

In [None]:
bw = 0.35
plt.figure(figsize=(16,7))
plt.bar(x=consumptions_years.index-bw/2, height=consumptions_years[consumptions_years.columns[1]], label='wind',
       width=bw)
plt.bar(x=consumptions_years.index+bw/2, height=consumptions_years[consumptions_years.columns[2]], label='solar',
       width=bw)

plt.title('Generation energy data',fontsize=16)
plt.xlabel('Year', fontsize=14)
plt.ylabel('Energy [TWh]', fontsize=14)
plt.grid()
plt.legend();

```{admonition} Info
:class: hint
Renewables were already generating energy before 2010; the dataset just does not provide data for those years.
```
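
If the years without data are stored as `NaN`, a plain `sum()` turns them into 0, which looks like zero generation in a bar chart. Passing `min_count=1` keeps such groups as `NaN` instead. A small sketch with made-up values (`demo` is a hypothetical frame):

```python
import numpy as np
import pandas as pd

# Made-up values: no wind data in 2009, two data points in 2010
idx = pd.to_datetime(['2009-06-01', '2009-06-02', '2010-06-01', '2010-06-02'])
demo = pd.DataFrame({'Wind [GWh]': [np.nan, np.nan, 100.0, 120.0]}, index=idx)

# Default: a group containing only NaN sums to 0
default_sum = demo.groupby(demo.index.year).sum()

# min_count=1: a group needs at least one valid value, otherwise the result is NaN
strict_sum = demo.groupby(demo.index.year).sum(min_count=1)

print(default_sum)
print(strict_sum)
```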

Wind power generation has risen significantly since 2014, while PV generation has grown only slowly.
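
Growth like this can be expressed as a year-over-year percentage change with `pct_change`. The numbers below are made up purely to illustrate the call and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical yearly totals in TWh (not the OPSD values)
yearly_twh = pd.Series([50.0, 55.0, 77.0], index=[2014, 2015, 2016], name='Wind [TWh]')

# Change relative to the previous year, in percent (the first year has no predecessor)
growth = yearly_twh.pct_change() * 100

print(growth.round(1))
```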