Chapter 2 Project Python
In recent years, machine learning techniques, particularly deep learning algorithms such as
Long Short-Term Memory (LSTM), have shown promising results in predicting crude oil
prices. LSTMs are a type of recurrent neural network (RNN) that can capture long-term
dependencies in time series data. LSTM networks have been successfully used in various
fields, including finance, to predict stock prices, foreign exchange rates, and commodity
prices.
This report aims to apply LSTM networks to predict crude oil prices. We will explore the
crude oil price data and perform some pre-processing steps to prepare the data for LSTM.
Then, we will develop an LSTM model, train it on historical crude oil price data, and use it to
predict future crude oil prices. Finally, we will evaluate the performance of the LSTM model
and compare it with other traditional time-series forecasting models.
2. `tensorflow`: an open-source library for machine learning and deep learning. It allows
users to build and train various neural network architectures, including recurrent neural
networks like LSTM.
3. `pandas`: a library for data manipulation and analysis in Python. It provides data structures
for efficiently storing and manipulating large datasets, as well as functions for data cleaning
and transformation.
4. `matplotlib`: a library for creating visualizations and plots in Python. It provides a variety
of functions for creating line plots, scatter plots, histograms, and more.
5. `sklearn` (the import name of the scikit-learn package): a library for machine learning and
data analysis in Python. It provides a variety of tools for data preprocessing, model
selection, and evaluation.
6. `seaborn`: a statistical visualization library for Python built on top of matplotlib. It
provides a high-level interface for creating aesthetically pleasing and informative
visualizations.
The data is then loaded and pre-processed: the date column is converted to datetime format, a
subset of the data is selected, and the values are standardized using the StandardScaler class
from sklearn. The training and testing samples are then built by pairing each sequence of past
observations (the input) with the future values that follow it (the output).
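
The pre-processing steps described above can be sketched as follows. This is a minimal
illustration, not the original script: the synthetic data, the column names `Date` and `Price`,
the subset size, the window length of 14, and the helper `make_windows` are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the crude oil data. The column names "Date" and
# "Price" are assumptions, not taken from the original script.
df = pd.DataFrame({
    "Date": pd.date_range("2022-01-03", periods=60, freq="B").strftime("%Y-%m-%d"),
    "Price": np.linspace(70.0, 90.0, 60),
})

# Convert the date column from strings to datetime values.
df["Date"] = pd.to_datetime(df["Date"])

# Select a subset of the data (the last 50 rows, an arbitrary choice).
subset = df.tail(50).reset_index(drop=True)

# Standardize the price to zero mean and unit variance.
scaler = StandardScaler()
scaled = scaler.fit_transform(subset[["Price"]])  # shape (50, 1)


def make_windows(values, n_past, n_future):
    """Pair each window of n_past rows with the value n_future steps after it."""
    X, y = [], []
    for i in range(n_past, len(values) - n_future + 1):
        X.append(values[i - n_past:i])          # past window: (n_past, n_features)
        y.append(values[i + n_future - 1, 0])   # target: first column, n_future ahead
    return np.array(X), np.array(y)


# Hypothetical window settings: 14 past days predict the next day.
X, y = make_windows(scaled, n_past=14, n_future=1)
print(X.shape, y.shape)
```

The resulting `X` has the three-dimensional shape (samples, time steps, features) that Keras
LSTM layers expect as input.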
A sequential LSTM model is then built with two LSTM layers, a dropout layer, and a dense
layer. The model is compiled with the Adam optimizer and a mean squared error loss function.
It is then trained on the training data for 5 epochs with a batch size of 16, using a
validation split of 0.1 (10% of the training samples are held out for validation).
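
A model of this shape might look like the sketch below, written with the Keras API in
tensorflow. The layer sizes (64 and 32 units), the dropout rate of 0.2, the window length, and
the random training data are assumptions made for illustration; only the layer types, the
optimizer, the loss, and the training settings come from the description above.

```python
import numpy as np
import tensorflow as tf

n_past, n_features = 14, 1  # assumed window length and number of features

# Two LSTM layers, a dropout layer, and a dense output layer.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_past, n_features)),
    tf.keras.layers.LSTM(64, return_sequences=True),  # passes the full sequence on
    tf.keras.layers.LSTM(32),                         # returns only the last state
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1),                         # one predicted (scaled) price
])
model.compile(optimizer="adam", loss="mse")

# Random placeholder data standing in for the windowed training set.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(80, n_past, n_features)).astype("float32")
y_train = rng.normal(size=(80,)).astype("float32")

# Training settings from the text: 5 epochs, batch size 16, validation split 0.1.
history = model.fit(
    X_train, y_train,
    epochs=5, batch_size=16, validation_split=0.1, verbose=0,
)
print(history.history["loss"])
```

The first LSTM layer sets `return_sequences=True` so that the second LSTM layer receives a
sequence rather than a single vector.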
The trained model is used to generate predicted prices for the next 15 days. The
CustomBusinessDay and USFederalHolidayCalendar classes from pandas are used to generate the
corresponding future business days, skipping weekends and US federal holidays. The predicted
prices are then inverse transformed back to the original price scale and plotted against the
original data using seaborn and matplotlib.
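
The date generation and inverse transform can be sketched as below. The last observed date, the
prices used to fit the scaler, and the placeholder array standing in for the model output are
hypothetical values chosen for illustration; plotting is omitted.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
from sklearn.preprocessing import StandardScaler

# Business-day offset that skips weekends and US federal holidays.
us_bd = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Hypothetical last date in the historical data (a Friday).
last_date = pd.Timestamp("2023-06-30")

# The next 15 business days after the last observed date.
future_dates = pd.date_range(start=last_date + us_bd, periods=15, freq=us_bd)

# Scaler fitted on made-up prices, standing in for the one used in pre-processing.
scaler = StandardScaler().fit(np.array([[70.0], [75.0], [80.0], [85.0], [90.0]]))

# Placeholder for the model's scaled predictions, one row per future day.
scaled_pred = np.linspace(-0.5, 0.5, 15).reshape(-1, 1)

# Map the predictions back to the original price scale.
pred_prices = scaler.inverse_transform(scaled_pred).ravel()

forecast = pd.DataFrame({"Date": future_dates, "Price": pred_prices})
print(forecast.head())
```

Because 4 July 2023 is a federal holiday, it does not appear in `future_dates` even though it
falls on a weekday.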
Taken together, the script is a compact end-to-end example of time series forecasting with
LSTM neural networks on a real-world financial dataset.