Aggregated EtudELEC Data: General Discussion

The first aggregation of the EtudELEC data is now available in the OTE collection on recherche.data.gouv: https://doi.org/10.57745/WIWMMK

The data can be cited as follows:

Osonuga, Seun; Imard, Vincent; Boisseau, Christophe; Wurtz, Frederic; Delinchant, Benoit, 2024, “EtudELEC Data, aggregated electricity consumption data from 400+ residential customers in France”, https://doi.org/10.57745/WIWMMK, Recherche Data Gouv, V1

Please feel free to suggest other filters that you are interested in on the “New Filters” discussion here: New Filters for EtudELEC

Do not hesitate to ask questions or make comments below.


Posting a question on the EtudELEC dataset from @Dalton_Brendan_Ncube here:

I analyzed the EtudELEC energy consumption dataset and the trends seem strange, or at least deviate from the norm. I am looking for assistance to understand them.

The following are graphs for each heating type

  1. Energy consumption throughout the study period, obtained using resample (resampling the data into hourly intervals, since it is given in 30-minute intervals).
  2. Average hourly consumption, obtained using groupby (grouping the data by hour and calculating the average energy consumption per hour).

[14 attached files: graphs for each heating type]

All the groups, except apartments and houses that use gas and apartments that use district heating, show the same variation in daily energy use (albeit at different levels) as dwellings that use electric heating, as seen in the picture below.

Plotting the historical temperature data alongside the energy consumption should show the coupling more clearly. Historical monthly data is available here: Données Publiques de Météo-France - Données SYNOP essentielles OMM
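As a rough illustration of the coupling check suggested above, here is a minimal sketch that aligns a daily consumption series with a daily temperature series and computes their correlation. Both series are synthetic stand-ins, and the column names `mean_power_w` and `temperature_c` are assumptions made for the example; they are not taken from the EtudELEC or Météo-France files.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins: neither series comes from EtudELEC or Meteo-France.
days = pd.date_range("2023-01-01", periods=365, freq="D", tz="UTC")
day_of_year = np.arange(365)

# Rough seasonal temperature curve in degrees C (cold in January, warm in July).
temperature_c = pd.Series(
    12.0 - 9.0 * np.cos(2 * np.pi * day_of_year / 365),
    index=days,
    name="temperature_c",
)

# Toy heating-driven load: consumption rises as temperature drops below 16 C.
power_w = pd.Series(
    300.0 + 60.0 * np.clip(16.0 - temperature_c.to_numpy(), 0.0, None),
    index=days,
    name="mean_power_w",
)

# Align the two series on their shared daily index.
daily = pd.concat([power_w, temperature_c], axis=1).dropna()

# A strongly negative correlation suggests a temperature-driven (heating) load.
coupling = daily["mean_power_w"].corr(daily["temperature_c"])
print(f"Correlation between consumption and temperature: {coupling:.2f}")
```

With real data, the same `daily` frame could be plotted on two y-axes to compare each heating group visually.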

The energy consumption values seem a bit low compared to what I got. May I have a brief description of the model you used?

Here is how I resampled and grouped the data, respectively:

# Loading the data
import pandas as pd
import matplotlib.pyplot as plt

houses = pd.read_excel('electricity_house-summary.xlsx')
apartments = pd.read_excel('electricity_apartment-summary.xlsx')

# Preprocessing the data to handle missing values

houses['timestamp'] = pd.to_datetime(houses['timestamp'], utc=True)
apartments['timestamp'] = pd.to_datetime(apartments['timestamp'], utc=True)

# Forward-fill gaps with the last known value
houses = houses.ffill()
apartments = apartments.ffill()

# Setting timestamp as index for resampling

houses = houses.set_index('timestamp')
apartments = apartments.set_index('timestamp')

# Analyzing energy consumption patterns

# Resampling to hourly intervals, since the original dataset is given in 30-minute intervals

houses_hourly = houses.resample('h').mean()
apartments_hourly = apartments.resample('h').mean()

# Visualizing resampled energy consumption

plt.figure(figsize=(10, 5))
plt.plot(houses_hourly.index, houses_hourly['Mean'], label='Houses')
plt.plot(apartments_hourly.index, apartments_hourly['Mean'], label='Apartments')
plt.xlabel('Time')
plt.ylabel('Energy Consumption (kWh)')
plt.title('Hourly Energy Consumption: Electricity Heating')
plt.legend()
plt.show()

# Grouping the data by hour of day and calculating the average energy consumption per hour
# (the hour is taken from the datetime index, since there is no 'hour' column)

houses_hourly_avg = houses_hourly.groupby(houses_hourly.index.hour)['Mean'].mean()
apartments_hourly_avg = apartments_hourly.groupby(apartments_hourly.index.hour)['Mean'].mean()

# Plotting average hourly consumption

houses_hourly_avg.plot(label='Houses', legend=True)
apartments_hourly_avg.plot(label='Apartments', legend=True)
plt.title('Average Hourly Energy Consumption')
plt.xlabel('Hour of Day')
plt.ylabel('Energy Consumption (kWh)')
plt.show()

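The base data is in watts (W). The hourly mean of a power series in W is numerically the energy used during that hour in Wh, so after resampling to hourly means you should divide by 1000 to get kWh.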
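A minimal sketch of that conversion, using made-up 30-minute power readings rather than values from the dataset. The `Mean` series name mirrors the column used in the code above; the numbers are purely illustrative.

```python
import pandas as pd

# Hypothetical 30-minute mean power readings in watts (made-up values).
index = pd.date_range("2024-01-01 00:00", periods=4, freq="30min", tz="UTC")
power_w = pd.Series([400.0, 600.0, 1000.0, 2000.0], index=index, name="Mean")

# Mean power over each hour, still in W.
hourly_mean_w = power_w.resample("h").mean()

# Mean power (W) sustained for one hour is numerically the energy in Wh;
# dividing by 1000 converts Wh to kWh.
hourly_kwh = hourly_mean_w / 1000.0

print(hourly_kwh)
```

Here the first hour averages 500 W, which corresponds to 0.5 kWh, and the second averages 1500 W, which corresponds to 1.5 kWh.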


This is what the groups' daily energy consumption looks like when plotted against temperature data from Orleans. The temperature data was sourced from: Le Gardien du Temps | Meteostat


Dear @seun_osanuga,
Is it possible to have the anonymized electricity consumption data for both apartments and houses per point of delivery (e.g. 15-minute electricity consumption data per consumer, also indicating whether it is an apartment or a house)?
We are evaluating the effect of demand aggregation on photovoltaic self-consumption.
An example of our research is: https://link.springer.com/chapter/10.1007/978-3-032-10546-2_78
Would be grateful!

Hello @gergelylaszlo,

Given that we published just the aggregated version, we will need validation from our ethics committee to share the individual time series with researchers outside OTE. The request has been made but might take some time.

However, we do have another dataset that you might find useful that has 150+ individual time-series (with a ~1 min timestep) and definitions of the subject (number of inhabitants, house/apartment, among others) here: Etude xKy: Données issues de compteurs linky avec échantillonnage inférieur à la minute - Observatoire de la Transition Énergétique

Let me know if this works.

Hi @seun_osonuga!

For our research, a full year of measurements is necessary. As far as I can see, the linked dataset covers roughly half a year. However, based on this link, it seems that data acquisition is still ongoing: xKy en temps réel | OTE - Université Grenoble Alpes.

Is it possible to have an extended measurement period covering a full year?