Stock market analysis is fundamentally built on three core methodologies: technical, fundamental, and quantitative (capital flow) analysis. Each approach offers unique insights and strategies for stock selection, and none can be deemed universally superior. The choice often depends on an individual's trading psychology, personal discipline, and investment goals. From a returns perspective, it's not just about buying right (high returns) but also selling at the right time (high Internal Rate of Return, IRR), since the value of a given return diminishes the longer it takes to realize.
For those who prioritize the impact of time on stock movements, daily candlestick charts hold valuable clues. Candlestick pattern analysis decodes the psychological battle between bulls and bears through five elements: open, close, high, low prices, and trading volume. This method, rooted in the historic "Sakata Strategies" by Japanese trader Homma Munehisa, has been extensively documented. However, market psychology evolves, and so do candlestick patterns. The question is: can modern technology help us interpret hundreds of complex, changing candlestick charts each day?
How Long Short-Term Memory (LSTM) Networks Work
Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) specially designed to process and predict time-series data with long-range dependencies. Consider this sentence:
“I grew up in France, so I speak fluent ___.”
The words "France" and "speak" provide contextual clues, allowing an LSTM model to predict "French" as the most likely missing word by leveraging both short and long-term signals.
Similarly, stock prices react to temporal sequences and critical signals embedded in candlestick patterns:
- A large bullish candle with high volume after a low-volatility consolidation may indicate an upward breakout.
- An island reversal pattern following a gap might signal an upcoming decline.
- A doji candle with long wicks after several rising days could suggest a potential trend reversal.
LSTM models learn to identify such significant signals within a sequence of candlestick data and predict subsequent price movements.
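Patterns like these can be expressed as simple rules over a candle's five elements. As an illustration, here is a hypothetical doji detector (the `is_doji` helper and its 0.1 body-to-range threshold are illustrative choices, not a standard definition):

```python
def is_doji(open_, high, low, close, body_ratio=0.1):
    """Flag a candle whose body is tiny relative to its full range."""
    full_range = high - low
    if full_range == 0:
        return False
    return abs(close - open_) / full_range <= body_ratio

# A candle with long wicks and a near-equal open and close:
print(is_doji(100.0, 105.0, 95.0, 100.2))   # True: body is 2% of range
# A large bullish candle:
print(is_doji(100.0, 105.0, 95.0, 104.0))   # False: body is 40% of range
```

Hand-written rules like this break down as patterns shift; the appeal of an LSTM is that it learns such signals from the data instead.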
Implementing LSTM for Stock Price Forecasting
In this practical example, we use historical data from Hon Hai Precision Industry (Foxconn, 2317 TW) from 2013 to 2017, including open, close, high, low prices, and volume.
Data Preprocessing
The first step is data loading and cleaning using pandas:
```python
import pandas as pd

foxconndf = pd.read_csv('./foxconn_2013-2017.csv', index_col=0)
foxconndf.dropna(how='any', inplace=True)
```

To ensure stable model training, we normalize the data to a [0,1] range using min-max scaling:
```python
from sklearn import preprocessing

def normalize(df):
    newdf = df.copy()
    min_max_scaler = preprocessing.MinMaxScaler()
    for col in ['open', 'low', 'high', 'volume', 'close']:
        newdf[col] = min_max_scaler.fit_transform(df[col].values.reshape(-1, 1))
    return newdf

foxconndf_norm = normalize(foxconndf)
```

Data Preparation for Time-Series Learning
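Before windowing the data, it is worth being clear about what the scaler above did: min-max scaling maps each column linearly onto [0, 1] via x_norm = (x − min) / (max − min). A toy check on made-up prices, without sklearn:

```python
# Min-max scaling by hand on synthetic prices (illustrative values only).
values = [80.0, 90.0, 100.0, 120.0]
lo, hi = min(values), max(values)

# Each value is shifted by the minimum and divided by the range.
normalized = [(v - lo) / (hi - lo) for v in values]
print(normalized)  # [0.0, 0.25, 0.5, 1.0]
```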
We structure the data into time-series samples with a defined window length:
```python
import numpy as np

def data_helper(df, time_frame):
    number_features = len(df.columns)
    datavalue = df.values

    # Collect overlapping windows of (time_frame + 1) consecutive days.
    result = []
    for index in range(len(datavalue) - (time_frame + 1)):
        result.append(datavalue[index: index + (time_frame + 1)])
    result = np.array(result)

    # 90/10 train/test split; the last row of each window supplies the
    # label (its last column, the close price).
    number_train = round(0.9 * result.shape[0])
    x_train = result[:int(number_train), :-1]
    y_train = result[:int(number_train), -1][:, -1]
    x_test = result[int(number_train):, :-1]
    y_test = result[int(number_train):, -1][:, -1]

    x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], number_features))
    x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], number_features))

    return [x_train, y_train, x_test, y_test]

X_train, y_train, X_test, y_test = data_helper(foxconndf_norm, 20)
```

Building the LSTM Model with Keras
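The model's expected input shape follows directly from this windowing: with a 20-day window over 5 features, each sample is a (20, 5) array. A quick sanity check of the shapes on synthetic data (the row count here is illustrative):

```python
import numpy as np

# With time_frame = 20, each sample holds 21 consecutive rows: the
# first 20 are the model input, the 21st supplies the target (its
# last column, standing in for the close price).
rows, features, time_frame = 100, 5, 20
data = np.arange(rows * features, dtype=float).reshape(rows, features)

windows = np.array([data[i : i + time_frame + 1]
                    for i in range(rows - (time_frame + 1))])
x = windows[:, :-1]        # inputs: 20 rows of 5 features each
y = windows[:, -1, -1]     # target: last column of the following day

print(x.shape, y.shape)  # (79, 20, 5) (79,)
```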
We construct a network with two LSTM layers followed by fully connected layers:
```python
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM

def build_model(input_length, input_dim):
    model = Sequential()
    model.add(LSTM(256, input_shape=(input_length, input_dim), return_sequences=True))
    model.add(Dropout(0.3))
    model.add(LSTM(256, return_sequences=False))
    model.add(Dropout(0.3))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation='linear'))
    # Mean squared error is the natural loss for this regression task;
    # classification metrics such as accuracy are not meaningful here.
    model.compile(loss='mse', optimizer='adam')
    return model

model = build_model(20, 5)
```

Model Training and Prediction
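Before training, a quick sanity check is the parameter count. A Keras LSTM layer holds 4 × (units × input_dim + units² + units) weights: four gates, each with a kernel over the input, a recurrent kernel over the state, and a bias. Applied to the two LSTM layers above (this is a by-hand calculation, not Keras output):

```python
def lstm_params(input_dim, units):
    """Weight count of one LSTM layer: 4 gates, each with an input
    kernel, a recurrent kernel, and a bias vector."""
    return 4 * (units * input_dim + units * units + units)

# First layer sees the 5 input features; the second layer sees the
# 256-unit sequence output of the first.
print(lstm_params(5, 256))    # 268288
print(lstm_params(256, 256))  # 525312
```

Over 99% of the weights sit in the recurrent layers, which is why dropout is applied right after them.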
The model is trained using the prepared training data:
```python
model.fit(X_train, y_train, batch_size=128, epochs=50, validation_split=0.1, verbose=1)
```

After training, we generate predictions and denormalize the values back to the original price scale:
```python
def denormalize(df, norm_value):
    # Refit the scaler on the same close-price column that was used
    # during normalization, then invert the transform.
    original_value = df['close'].values.reshape(-1, 1)
    min_max_scaler = preprocessing.MinMaxScaler()
    min_max_scaler.fit_transform(original_value)
    denorm_value = min_max_scaler.inverse_transform(norm_value.reshape(-1, 1))
    return denorm_value

pred = model.predict(X_test)
denorm_pred = denormalize(foxconndf, pred)
denorm_ytest = denormalize(foxconndf, y_test)
```

Evaluating Prediction Results
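Denormalization is simply the inverse of the min-max mapping, x = x_norm × (max − min) + min, using the min and max of the original close column; this is why `denormalize` must be given the same dataframe the scaler was fitted on. A toy check with made-up numbers:

```python
# Inverting min-max scaling by hand (synthetic closes, illustrative only).
closes = [80.0, 90.0, 100.0, 120.0]
lo, hi = min(closes), max(closes)

# Normalized model outputs are stretched back to the price scale.
norm_preds = [0.0, 0.5, 1.0]
denorm = [p * (hi - lo) + lo for p in norm_preds]
print(denorm)  # [80.0, 100.0, 120.0]
```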
Visualizing the results shows the predicted versus actual prices:
```python
import matplotlib.pyplot as plt

plt.plot(denorm_pred, color='red', label='Prediction')
plt.plot(denorm_ytest, color='blue', label='Actual')
plt.legend(loc='best')
plt.show()
```

Initial results often show a lagging prediction, which is common in financial time-series forecasting. To improve accuracy, consider adjusting:
- The input time window length
- Activation functions and optimizers in the network
- Network architecture, layer types, and neuron counts
- Batch size and number of training epochs
After parameter tuning, predictions can become more closely aligned with actual movements.
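One quick diagnostic for a lagging forecast is to compare it against a naive persistence baseline that simply repeats the previous close; a model worth keeping should beat this baseline's error. A sketch on synthetic closes (the values and the RMSE metric here are illustrative choices, not from the tutorial's dataset):

```python
import math

def rmse(preds, actuals):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(preds, actuals))
                     / len(actuals))

# Synthetic closing prices for illustration only.
actual = [100.0, 102.0, 101.0, 104.0, 103.0]

# Persistence baseline: today's predicted close is yesterday's close.
persistence = actual[:-1]   # predictions for days 2..5
targets = actual[1:]

print(round(rmse(persistence, targets), 3))  # 1.936
```

Running the same `rmse` over the model's denormalized predictions and comparing the two numbers tells you whether the network has learned anything beyond copying the last price.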
Frequently Asked Questions
What is the main advantage of using LSTM for stock prediction?
LSTM networks excel at capturing long-term dependencies in sequential data, making them suitable for recognizing complex patterns in financial time-series that traditional models might miss.
How much historical data is needed to train a reliable model?
While there's no fixed rule, generally 3-5 years of daily data is a good starting point. The quality and relevance of data matter more than sheer volume for achieving reliable predictions.
Can this model predict sudden market crashes or black swan events?
Like most data-driven models, LSTMs struggle with unprecedented events since they learn from historical patterns. They are better suited for forecasting under relatively normal market conditions.
What are alternatives to LSTM for time-series forecasting?
Other approaches include GRUs (Gated Recurrent Units), Transformer models, and traditional statistical methods like ARIMA. Each has strengths depending on data characteristics and prediction goals.
How often should the model be retrained with new data?
Regular retraining is essential—typically quarterly or annually—to adapt to evolving market conditions. Continuous learning systems can also update parameters incrementally as new data arrives.
What hardware is recommended for training such models?
While basic models can run on CPUs, GPUs significantly accelerate training for large datasets. Cloud-based ML platforms offer scalable resources for experimenting with different architectures.
Conclusion
Deep learning offers powerful techniques for decoding candlestick charts and predicting stock movements. While LSTMs show promise in recognizing temporal patterns, success requires thoughtful data preprocessing, model design, and continuous parameter optimization. Remember that no model guarantees absolute accuracy—use these tools as part of a diversified analytical approach combined with risk management principles.