Using long short-term memory neural networks to analyze SEC 13D filings: a recipe for human and machine interaction

active learning; computational linguistics; neural networks; shareholder activism

John Wiley & Sons


We implement an efficient methodology for extracting themes from Securities and Exchange Commission (SEC) 13D filings using aspects of human-assisted active learning and long short-term memory (LSTM) neural networks. Sentences from the ‘Purpose of Transaction’ section of each filing are extracted, and a randomly chosen subset is labelled according to six filing themes that the existing literature on shareholder activism has shown to have an impact on stock returns. We find that an LSTM neural network that accepts sentences as input performs significantly better, with precision of 77%, than an alternatively specified neural network that uses the common bag-of-words approach. This indicates that both sentence structure and vocabulary are important in classifying SEC 13D filings. Our study has important implications, as it addresses recent cautions raised in the literature that analysis of finance and accounting-related text sources should move beyond bag-of-words approaches to alternatives that incorporate the analysis of word sense and meaning reflecting context.
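
The contrast between bag-of-words input and sentence input can be illustrated with a minimal sketch. It is not the authors' implementation: the two example sentences, the function names, and the padding length are all hypothetical, and no neural network is trained here. It only shows that a bag-of-words vector discards word order, whereas an index sequence of the kind typically fed to an LSTM preserves it.

```python
from collections import Counter


def tokenize(sentence):
    # Simplistic whitespace tokenizer (assumption; real filings need more care).
    return sentence.lower().split()


def build_vocab(sentences):
    # Map each token to a positive integer id; 0 is reserved for padding.
    vocab = {}
    for s in sentences:
        for tok in tokenize(s):
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def bag_of_words(sentence, vocab):
    # Count vector in vocabulary-id order: word order is lost.
    counts = Counter(tokenize(sentence))
    return [counts.get(tok, 0) for tok in sorted(vocab, key=vocab.get)]


def index_sequence(sentence, vocab, max_len):
    # Ordered token ids, truncated or zero-padded to max_len: word order is kept.
    ids = [vocab[tok] for tok in tokenize(sentence)][:max_len]
    return ids + [0] * (max_len - len(ids))


# Hypothetical sentences with the same words but opposite meanings.
a = "the fund may seek board seats but not a sale"
b = "the fund may seek a sale but not board seats"

vocab = build_vocab([a, b])
bow_a, bow_b = bag_of_words(a, vocab), bag_of_words(b, vocab)
seq_a, seq_b = index_sequence(a, vocab, 12), index_sequence(b, vocab, 12)

print(bow_a == bow_b)  # the bag-of-words vectors cannot tell the sentences apart
print(seq_a == seq_b)  # the ordered sequences differ
```

A classifier that sees only `bow_a` and `bow_b` must assign both sentences the same theme, while a sequence model receiving `seq_a` and `seq_b` can in principle distinguish them.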