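Author: Raoul Malm
Description:
This notebook demonstrates how to predict the future prices of different stocks using recurrent neural networks in TensorFlow. Recurrent neural networks with basic, LSTM, or GRU cells are implemented.
Outline:
- [Libraries and settings]
- [Analyze data]
- [Manipulate data]
- [Model and validate data]
- [Predictions]
Reference: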
https://www.kaggle.com/benjibb/lstm-stock-prediction-20170507/notebook
1. Libraries and settings
import numpy as np
import pandas as pd
import math
import sklearn
import sklearn.preprocessing
import datetime
import os
import matplotlib.pyplot as plt
import tensorflow as tf

# split data in 80%/10%/10% train/validation/test sets
valid_set_size_percentage = 10
test_set_size_percentage = 10

# display parent directory and working directory
# print(os.path.dirname(os.getcwd())+':', os.listdir(os.path.dirname(os.getcwd())))
# print(os.getcwd()+':', os.listdir(os.getcwd()))
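The two percentage settings above imply that the remaining 80% of the rows go to training. A minimal sketch of how such settings could be turned into set sizes; the helper name `split_sizes` and the sample count of 1000 are hypothetical and not part of the original notebook:

```python
# Hypothetical helper (not from the original notebook): turn the two
# percentage settings into train/validation/test set sizes.
valid_set_size_percentage = 10
test_set_size_percentage = 10


def split_sizes(n_samples, valid_pct, test_pct):
    """Return (train, valid, test) sizes that sum to n_samples."""
    valid_size = int(round(valid_pct / 100 * n_samples))
    test_size = int(round(test_pct / 100 * n_samples))
    # Whatever is left after validation and test goes to training.
    train_size = n_samples - valid_size - test_size
    return train_size, valid_size, test_size


train_size, valid_size, test_size = split_sizes(
    1000, valid_set_size_percentage, test_set_size_percentage
)
print(train_size, valid_size, test_size)  # 800 100 100
```

Computing the training size as the remainder keeps the three sizes summing to the total even when rounding changes the validation or test size.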
2. Analyze data
- Load stock prices from prices-split-adjusted.csv
- Analyze data
# import all stock prices
df = pd.read_csv("./prices-split-adjusted.csv", index_col = 0)
df.info()
df.head()

# number of different stocks
print('\nnumber of different stocks: ', len(list(set(df.symbol))))
print(list(set(df.symbol))[:10])
<class 'pandas.core.frame.DataFrame'>
Index: 851264 entries, 2016-01-05 to 2016-12-30
Data columns (total 6 columns):
 #   Column  Non-Null Count   Dtype
---  ------  --------------   -----
 0   symbol  851264 non-null  object
 1   open    851264 non-null  float64
 2   close   851264 non-null  float64
 3   low     851264 non-null  float64
 4   high    851264 non-null  float64
 5   volume  851264 non-null  float64
dtypes: float64(5), object(1)
memory usage: 45.5+ MB

number of different stocks:  501
['JNJ', 'ZION', 'CNC', 'BBBY', 'CTL', 'PHM', 'FTR', 'MOS', 'TSCO', 'SWN']
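Because `list(set(df.symbol))` goes through a Python set, the order of the ten printed symbols is arbitrary and can differ between runs. A small sketch of an alternative using pandas' `nunique()`, `unique()`, and `value_counts()`, shown on a made-up frame (the symbols and prices below are illustrative, not taken from the data set):

```python
import pandas as pd

# Toy frame with a layout similar to the price file; all values are made up.
toy = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB", "AAA", "CCC", "BBB"],
        "close": [10.0, 20.0, 10.5, 30.0, 19.5],
    },
    index=pd.Index(
        ["2016-01-04", "2016-01-04", "2016-01-05", "2016-01-05", "2016-01-05"],
        name="date",
    ),
)

n_symbols = toy["symbol"].nunique()          # number of distinct symbols
first_symbols = toy["symbol"].unique()[:10]  # in order of first appearance
rows_per_symbol = toy["symbol"].value_counts()

print(n_symbols)               # 3
print(list(first_symbols))     # ['AAA', 'BBB', 'CCC']
print(rows_per_symbol["AAA"])  # 2
```

`value_counts()` additionally shows how many rows each symbol contributes, which helps spot stocks with a shorter price history.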
In [3]:
df.tail()
Out[3]:
date | symbol | open | close | low | high | volume |
---|---|---|---|---|---|---|
2016-12-30 | ZBH | 103.309998 | 103.199997 | 102.849998 | 103.930000 | 973800.0 |
2016-12-30 | ZION | 43.070000 | 43.040001 | 42.689999 | 43.310001 | 1938100.0 |
2016-12-30 | ZTS | 53.639999 | 53.529999 | 53.270000 | 53.740002 | 1701200.0 |
2016-12-30 | AIV | 44.730000 | 45.450001 | 44.410000 | 45.590000 | 1380900.0 |
2016-12-30 | FTV | 54.200001 | 53.630001 | 53.389999 | 54.480000 | 705100.0 |
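As the tail shows, one date is shared by many symbols, so the date index alone does not identify a row. A sketch of extracting a single symbol as a time series, assuming the index holds date strings (the `df.info()` output reports a plain `Index`); the toy values and the name `one_stock` are illustrative only:

```python
import pandas as pd

# Toy frame: two symbols, two dates, deliberately not sorted by date.
toy = pd.DataFrame(
    {
        "symbol": ["ZION", "ZBH", "ZION", "ZBH"],
        "close": [43.0, 103.0, 42.5, 102.0],
    },
    index=pd.Index(
        ["2016-12-30", "2016-12-30", "2016-12-29", "2016-12-29"],
        name="date",
    ),
)

# Keep the rows of one symbol, parse the date strings, and sort by date.
one_stock = toy[toy["symbol"] == "ZION"].copy()
one_stock.index = pd.to_datetime(one_stock.index)
one_stock = one_stock.sort_index()

print(one_stock["close"].tolist())  # [42.5, 43.0]
```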