Hourly Temperature Forecasting
Studying trends in weather data recorded at the Max Planck Institute for Biogeochemistry in Jena, Germany.
- Project At A Glance
- Dependencies
- Dataset Initialization
- Temperature Plot (degC)
- Time-Series Window
- Train-Test Split
- Model Setup, Layers and Callbacks
- Predictions and Variance
- Results and Visualization
Project At A Glance
Objective
: Iteratively forecast hourly temperature for a region over a long time period (8 years in this case).
Data
: Time-series weather dataset recorded at the Max Planck Institute for Biogeochemistry in Jena, Germany. [Download]
Implementation
: Time-Series Forecasting, Sequential Long Short-Term Memory (LSTM)
Results
:
- Clear seasonal trends in the temperature data across the year.
- DataFrames comparing actual and predicted values for the training, validation, and test sets.
- Visualizations to judge the model's performance.
Deployment
: View this project on GitHub.
import os
import numpy as np
import pandas as pd
import tensorflow as tf
zip_path = tf.keras.utils.get_file(
    origin='https://storage.googleapis.com/tensorflow/tf-keras-datasets/jena_climate_2009_2016.csv.zip',
    fname='jena_climate_2009_2016.csv.zip',
    extract=True)
# get_file extracts the archive alongside the zip; dropping the .zip extension gives the csv path
csv_path, _ = os.path.splitext(zip_path)
df = pd.read_csv(csv_path)
df
# The raw data is recorded every 10 minutes; keep every 6th row for hourly samples
df = df[5::6]
df
# Parse the timestamp column and use it as the index
df.index = pd.to_datetime(df['Date Time'], format='%d.%m.%Y %H:%M:%S')
df[:6]
df1 = df['T (degC)']
df1.plot()
def create_dataset(df, window_size=5):
    df_as_np = df.to_numpy()
    X = []
    y = []
    for i in range(len(df_as_np) - window_size):
        # Each input is a window of consecutive observations, shaped for the LSTM
        row = [[a] for a in df_as_np[i:i+window_size]]
        X.append(row)
        # The label is the observation immediately after the window
        label = df_as_np[i+window_size]
        y.append(label)
    return np.array(X), np.array(y)
window_size = 5
X, y = create_dataset(df1, window_size)
X.shape, y.shape
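As a quick illustration of what the windowing produces, here is a hypothetical toy series (not the Jena data) run through the same logic with a window of 3:

```python
import numpy as np
import pandas as pd

def create_dataset(df, window_size=5):
    # Same windowing logic as above
    df_as_np = df.to_numpy()
    X, y = [], []
    for i in range(len(df_as_np) - window_size):
        X.append([[a] for a in df_as_np[i:i+window_size]])
        y.append(df_as_np[i+window_size])
    return np.array(X), np.array(y)

toy = pd.Series([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
X_toy, y_toy = create_dataset(toy, window_size=3)
print(X_toy.shape)   # (5, 3, 1)
print(y_toy)         # [4. 5. 6. 7. 8.]
```

Each of the 5 samples holds 3 consecutive values, and each label is the value that follows its window.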
# Chronological split: earliest windows for training, then validation, then testing
X_train, y_train = X[:60000], y[:60000]
X_val, y_val = X[60000:65000], y[60000:65000]
X_test, y_test = X[65000:], y[65000:]
X_train.shape, y_train.shape, X_val.shape, y_val.shape, X_test.shape, y_test.shape
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, LSTM, Dense
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.losses import MeanSquaredError
from tensorflow.keras.metrics import RootMeanSquaredError
from tensorflow.keras.optimizers import Adam
model = Sequential()
model.add(InputLayer((5, 1)))
model.add(LSTM(64))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='linear'))
model.summary()
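The parameter counts reported by the summary can be checked by hand, using the standard Keras formulas with this model's sizes:

```python
# LSTM: 4 gates, each with (input_dim + units) weights plus a bias, per unit
input_dim, units = 1, 64
lstm_params = 4 * units * (input_dim + units + 1)
print(lstm_params)  # 16896

# Dense layers: weights + biases
dense1_params = units * 8 + 8  # 64 inputs -> 8 units
dense2_params = 8 * 1 + 1      # 8 inputs -> 1 unit
print(lstm_params + dense1_params + dense2_params)  # 17425
```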
cp1 = ModelCheckpoint('model/', save_best_only=True)
model.compile(loss=MeanSquaredError(), optimizer=Adam(learning_rate=0.0001), metrics=[RootMeanSquaredError()])
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=12, callbacks=[cp1])
from tensorflow.keras.models import load_model
model = load_model('model')
train_predictions = model.predict(X_train).flatten()
train_results = pd.DataFrame(data={'Train Predictions':train_predictions, 'Actuals':y_train})
train_results
test_predictions = model.predict(X_test).flatten()
test_results = pd.DataFrame(data={'Test Predictions':test_predictions, 'Actuals':y_test})
test_results
val_predictions = model.predict(X_val).flatten()
val_results = pd.DataFrame(data={'Val Predictions':val_predictions, 'Actuals':y_val})
val_results
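The per-row variance between actuals and predictions mentioned in the results can be tabulated as an error column. A minimal sketch with hypothetical stand-in values (the real code would pass, e.g., test_predictions and y_test):

```python
import numpy as np
import pandas as pd

def error_summary(predictions, actuals):
    # Tabulate predictions next to actuals with their difference, plus RMSE
    results = pd.DataFrame({'Predictions': predictions, 'Actuals': actuals})
    results['Error'] = results['Predictions'] - results['Actuals']
    rmse = float(np.sqrt((results['Error'] ** 2).mean()))
    return results, rmse

# Hypothetical values standing in for model output
preds = np.array([10.0, 11.0, 12.0])
actuals = np.array([10.5, 10.5, 12.5])
results, rmse = error_summary(preds, actuals)
print(rmse)  # 0.5
```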
import matplotlib.pyplot as plt
plt.plot(train_results['Train Predictions'][50:100], label='Train Predictions')
plt.plot(train_results['Actuals'][50:100], label='Actuals')
plt.legend()
plt.plot(test_results['Test Predictions'][:100], label='Test Predictions')
plt.plot(test_results['Actuals'][:100], label='Actuals')
plt.legend()
plt.plot(val_results['Val Predictions'][:100], label='Val Predictions')
plt.plot(val_results['Actuals'][:100], label='Actuals')
plt.legend()