Novel Database is Better at Measuring and Predicting Core Earnings and Stock Prices


A recently published study by three Harvard Business School and MIT Sloan professors, based on a novel database provided by New Constructs, shows that (a) markets do not efficiently assess core earnings, (b) disclosure of material unusual items has grown significantly, and (c) incorporating these material but hard-to-find unusual items enables analysts to measure and predict companies’ core earnings – a key measure of a firm’s current and future financial performance – more accurately than traditional fundamental databases allow.

Key Findings of this New Study

The new academic paper, Core Earnings: New Data and Evidence, was written by professors Ethan Rouen and Charles C.Y. Wang of Harvard Business School and Eric So of MIT’s Sloan School of Management. The study analyzes core earnings using the financial database of independent research provider New Constructs, which uses AI and machine learning to extract fundamental data from public companies’ 10-Ks, especially the harder-to-find data hidden in the footnotes and the MD&A sections.

In their study, Professors Rouen, So & Wang show that disclosures of non-operating and less persistent income-statement items have, over time, become both more frequent and more economically significant. Adjusting GAAP earnings to exclude these items creates a significantly more accurate measure of core earnings, one that is free of management and street analysts’ bias, is highly persistent, and more accurately forecasts future company financial performance.
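The adjustment described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the study’s or New Constructs’ actual method: the `UnusualItem` type, the `core_earnings` function, the sign convention (adjustments equal core earnings minus GAAP earnings) and the figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class UnusualItem:
    """A hypothetical non-operating or non-recurring item from a 10-K."""
    description: str
    amount: float  # effect on GAAP net income: positive = gain, negative = charge


def core_earnings(gaap_net_income, items):
    """Strip unusual items out of GAAP net income.

    Returns (core earnings, total adjustments), where total adjustments
    is assumed here to mean core earnings minus GAAP net income.
    """
    # Removing a charge adds it back; removing a gain subtracts it.
    total_adjustments = -sum(item.amount for item in items)
    return gaap_net_income + total_adjustments, total_adjustments


# Made-up figures in $ millions, for illustration only.
items = [
    UnusualItem("gain on asset sale", 15.0),
    UnusualItem("restructuring charge", -40.0),
]
core, adjustments = core_earnings(100.0, items)
print(core, adjustments)
```

In this made-up case the one-off charge outweighs the one-off gain, so the adjusted figure comes out above reported GAAP income.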

In addition, the study found that analysts and other market participants are slow to discover and adjust for the impact of different types of earnings, expenses, gains or losses in their own analyses.  This fact has created market inefficiencies due to a fundamental misunderstanding of company earnings.

The main reason behind this market inefficiency is that, prior to New Constructs creating its database, there was no consistent source for market participants to collect the unusual items distorting existing measures of core earnings.

“To further explore Compustat’s treatment of non-recurring items that appear on the income statement, we examined a random sample of 30 firm-years that reported economically meaningful items on their income statements to determine if and where Compustat reported these items. In all instances, NC identified the items as non-operating, and Core Earnings includes adjustments for these items. In 10 of the firm-years, the item was not reported in any Compustat variable; the other 20 items were reported in 13 different variables.” – page 14, 1st paragraph, 4th-7th sentences

The authors repeatedly refer to the comprehensiveness, accuracy and scale of the New Constructs database as offering researchers and investors a new landscape for analyzing accounting data and its relationship with stock price performance.

“…the NC dataset provides a novel opportunity to study the properties of non-operating items disclosed in 10-Ks, and to examine the extent to which the market impounds their implications.” – page 19, 2nd paragraph, last sentence

The authors also show that a trading strategy exploiting the cross-sectional differences between firms with the highest Total Adjustments and firms with the lowest Total Adjustments would produce abnormal stock returns of 7% to 10% per year.
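To make the idea of a cross-sectional strategy concrete, here is a rough Python sketch of a high-minus-low portfolio spread. It is a simplification under stated assumptions, not the authors’ procedure: the function name `high_minus_low_spread`, the 20% cutoff, the direction of the trade (long the highest Total Adjustments, short the lowest) and every number in the sample are hypothetical, and the sketch compares raw returns without the risk adjustment that a measure of abnormal returns would require.

```python
from statistics import mean


def high_minus_low_spread(firms, fraction=0.2):
    """Average return of the top group minus that of the bottom group.

    firms: list of (total_adjustments, subsequent_return) pairs, one per firm.
    fraction: share of firms placed in each extreme group (assumed cutoff).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    ranked = sorted(firms, key=lambda firm: firm[0])  # lowest adjustments first
    n = max(1, int(len(ranked) * fraction))
    low_returns = [ret for _, ret in ranked[:n]]
    high_returns = [ret for _, ret in ranked[-n:]]
    return mean(high_returns) - mean(low_returns)


# Made-up sample: (total adjustments scaled by assets, next-year return).
sample = [
    (-0.05, 0.02), (-0.03, 0.01), (-0.01, 0.04), (0.00, 0.05), (0.01, 0.06),
    (0.02, 0.03), (0.03, 0.07), (0.04, 0.08), (0.06, 0.10), (0.08, 0.12),
]
spread = high_minus_low_spread(sample, fraction=0.2)
print(f"high-minus-low spread: {spread:.3f}")
```

The sample values are invented purely to exercise the function and say nothing about the returns reported in the study.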

Our Take

The new study, Core Earnings: New Data and Evidence, by professors Rouen, So & Wang, states that the implications of its findings on “core earnings” are “potentially far-reaching for investors and for researchers”.  In itself, this study should be of interest to analysts and investors.

However, another key result of this study is the researchers’ determination that most traditional measures of core earnings, including First Call’s Street Earnings, IBES’ non-GAAP income, and Compustat’s income before extraordinary items, have not been terribly reliable or predictive.  The authors even state that “…many of the income-statement-relevant quantitative disclosures collected by NC [New Constructs] do not appear to be easily identifiable in Compustat…”  In other words, Rouen, Wang & So found New Constructs’ dataset to be the most comprehensive, reliable, and predictive when it comes to measuring companies’ core earnings – all reasons why they used the New Constructs data in their latest study.

It is also worth noting that New Constructs’ management has confirmed its desire to make its database as affordable as possible for first movers.  David Trainer, CEO and founder of New Constructs, explained, “New Constructs currently offers our database via API at a significant discount to the cost of traditional offerings.  The firm’s strategy is to continue to exploit the cost advantage created by the proprietary technology we created to parse financial filings (for private and public firms) more efficiently.”



About Author

Mike Mayhew is one of the leading experts on the investment research industry. In addition to founding Integrity Research, Mike is on the board of directors of Investorside Research Association, the non-profit trade association for the independent research industry, and a frequent speaker on research industry trends and developments. Mike has over thirty years of research industry experience.
