The following guest article was written by Ronan Crosson, Director of Data Strategy & Analytics at Eagle Alpha, an alternative data aggregation platform that also provides supporting advisory services for data buyers and vendors.
ESG is the hottest topic in the data world right now. Regulators and asset owners are putting increasing pressure on funds to invest in a more environmentally friendly and socially responsible way. This provides a great opportunity for funds that offer ESG solutions, and we have seen massive growth in the AUM directed to funds claiming to incorporate ESG factors into their investment process.
The rise in ESG investing presents some challenges for investors though. The most common challenge we hear about is gaps in the data available, particularly relative to the expectations of regulators and clients.
A closely aligned issue is investor mistrust of company-reported measures, which are seen as unreliable, misleading, or incomplete. Said another way, we are seeing a rise in investor attention to company greenwashing. Clients are interested in techniques for identifying whether a company is overstating or simply lying about its ESG credentials.
For most of the market the major ESG rating agencies are the go-to for ESG data and insights. While there is utility in the ratings from these agencies, there are some weaknesses too. The rating agencies are heavily reliant on company-disclosed information, can have opaque methodologies, and are generally more focused on the policies of companies than on their actions.
For all these reasons alternative datasets are increasingly being seen as the source of ESG truth. Alternative data is data external to the rated entity, so it helps overcome many of the weaknesses highlighted earlier.
The Gaps in ESG Data
We categorize ESG data gaps into two broad buckets:
- Coverage gaps – where the data is available for some entities and not others
- Granularity gaps – where the level of granularity required isn’t available to any great extent
The more sophisticated funds are using machine learning (ML) to address many coverage gaps. The feasibility and accuracy of this depend on what you are trying to model. For example, ML models are quite good at predicting the rating a major agency might apply to a company in developed markets because there are usually many similar companies with which to train the model. By contrast, the models are weaker at predicting CO2 emissions for a company in a developing market, given there is less data with which to train a model.
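To make the idea of gap-filling concrete, here is a minimal sketch using a nearest-neighbour estimate: a missing value is filled with the average of the most similar companies that do have data. This is an illustration only, not the approach of any particular fund or vendor. The function name `knn_fill`, the two features, and all the numbers are hypothetical and made up for the example.

```python
from math import sqrt

def knn_fill(known, target_features, k=3):
    """Estimate a missing value as the mean of the k most similar covered companies.

    known: list of (feature_vector, value) pairs for companies that have data.
    """
    def distance(pair):
        features = pair[0]
        return sqrt(sum((a - b) ** 2 for a, b in zip(features, target_features)))

    nearest = sorted(known, key=distance)[:k]
    return sum(value for _, value in nearest) / len(nearest)

# Hypothetical, made-up data: (scaled revenue, scaled energy intensity) -> score.
# Three companies look alike (a "dense" peer group); two sit in a sparse group.
covered = [
    ((0.90, 0.20), 72.0),
    ((0.80, 0.30), 68.0),
    ((0.85, 0.25), 70.0),
    ((0.10, 0.90), 35.0),
    ((0.20, 0.80), 40.0),
]

# A company with many close peers gets an estimate from genuinely similar firms.
dense_estimate = knn_fill(covered, (0.82, 0.28), k=3)

# A company in the sparse group has only two close peers, so the third
# "neighbour" is a dissimilar firm that drags the estimate away.
sparse_estimate = knn_fill(covered, (0.15, 0.85), k=3)

print(round(dense_estimate, 1), round(sparse_estimate, 1))
```

The second estimate illustrates the point made above: where few comparable companies exist, the model has to lean on dissimilar ones and the estimate becomes less trustworthy.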
Other coverage gaps include ESG data for markets outside of English speaking developed markets, and data for asset classes other than public equities.
Significant time is then committed to identifying data vendors that address these gaps. Examples include vendors that provide detailed ESG data in markets such as China, Japan, Korea and India, and vendors with coverage of asset classes such as private companies, real estate and infrastructure, and sovereigns.
Natural Language Processing (NLP) solutions are particularly popular for addressing coverage gaps as they provide a scalable way to assess the ESG credentials of an investment based on publicly available data. See figures 2, 3 and 6 later for examples of how NLP can be applied to ESG questions.
Granularity gaps pose a greater challenge than coverage gaps because the data is generally not available for any company, so the ability to use gap-filling ML models is severely restricted. Within ESG, the social pillar is rife with granularity gaps.
SASB outlines social sub-topics including customer privacy, data security, labor practices, employee engagement, diversity & inclusion and employee health & safety. Many of these are poorly reported on by companies, if at all.
Figure 1: Female workforce participation at Adobe and Oracle (Source: Revelio Labs)
Employment data is a powerful category for analyzing many of the human capital considerations within the social pillar. For example, online employee profile data can be used to understand diversity and potential discriminatory practices. Figure 1 shows an example of this type of analysis. The analysis clearly shows a disparity between Oracle and Adobe in terms of women being promoted to senior manager levels.
Figure 2: Product safety and quality issues from SEC 10-K filings (Source: Accern)
AI & NLP techniques are also popular for addressing the gaps in social data. The example in figure 2 shows negative key passages related to product safety and quality issues from SEC 10-K filings for companies in the Russell 3000. The AI model surfaced passages on product safety, quality control, quality assurance, and customer product safety that could indicate a hidden trend among certain industries or companies. Product safety issues were most apparent in the consumer goods and manufacturing industries.
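As a rough illustration of how passages like these can be surfaced from filing text, the sketch below flags sentences that mention product safety or quality terms. It is a simple keyword filter, not the AI model behind figure 2, and it makes no attempt to judge whether a passage is negative. The term list, the function name `flag_passages`, and the sample text are all assumptions made for the example.

```python
import re

# Hypothetical term list; a production model would learn far richer patterns.
SAFETY_TERMS = (
    "product recall",
    "product safety",
    "quality control",
    "quality assurance",
    "defect",
)

def flag_passages(text, terms=SAFETY_TERMS):
    """Return (sentence, matched_terms) for every sentence mentioning a term."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        hits = [term for term in terms if term in lowered]
        if hits:
            flagged.append((sentence, hits))
    return flagged

# Made-up text in the style of a risk-factor section.
sample = (
    "We rely on third-party suppliers. "
    "A product recall could harm our reputation. "
    "Failures in quality control may lead to defects and litigation."
)

flagged = flag_passages(sample)
for sentence, hits in flagged:
    print(hits, "->", sentence)
```

Counting flagged passages per company or per industry is then a natural next step for spotting where such issues cluster.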
Figure 3: ESG risks over time (Source: SESAMm)
Another AI & NLP example, shown in figure 3, involves Wirecard. The analysis detected and flagged a significant number of fraud discussions and complaints on the web six months before these mentions had a material impact on the company’s stock price. This generated a strong quantitative signal. The company’s involvement in a scheme to artificially inflate profits was later more widely discovered and ultimately led to the company becoming insolvent.
In the environmental area, scope 3 emissions data is one example of where alternative data can plug an important gap in data availability. Figure 4 below shows the five highest-emitting companies based on scope 3 emissions. In terms of absolute emissions, the company for which the largest downstream GHG emissions were calculated is Gazprom, the Russian gas giant. Indeed, at 3,574 million tCO2e, its scope 3 emissions account for nearly 30% of the sample’s scope 3 emissions.
Figure 4: Scope 3 emissions in billion tCO2e (Source: Carbon4Finance)
There is some overlap between the topic of gaps and that of greenwashing. Some companies may engage in a form of greenwashing by selectively releasing information and deciding not to release other, more damaging, information. In other cases, a company’s statements may simply not match the reality of its ESG credentials. Given the reliance of mainstream ESG rating agencies on company-reported information, company greenwashing can result in ESG ratings that are out of step with reality.
Employment data again proves valuable in identifying potential greenwashing. Tracking the previous and subsequent roles of sustainability leads at a company can indicate whether a company is genuine in its sustainability efforts. One analysis showed that corporate strategists and communication specialists are the most common previous roles for Chief Green Officers (CGOs), and a very small percentage of CGOs actually held science-based positions previously.
Figure 5: Green Job Postings compared to ESG Ratings (Source: LinkUp)
Another case study compared the ESG scores for the oil majors from a major ESG ratings provider to their levels of sustainability recruitment (figure 5). A gap between the two could indicate companies whose actions in environmental areas diverge from their policies and stated targets.
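One simple way to frame this kind of comparison is to rank companies on each measure and look at the difference in ranks. The sketch below does that. It is an assumption about how such a comparison might be set up, not the method used in the case study, and the company names, scores, and green job-posting shares are invented for illustration.

```python
def rank_desc(values):
    """Map each name to its rank, where 1 is the highest value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def rank_gaps(esg_scores, green_share):
    """Hiring rank minus rating rank for each company.

    A positive gap means the company ranks better on its ESG rating
    than on its share of green job postings.
    """
    score_rank = rank_desc(esg_scores)
    hiring_rank = rank_desc(green_share)
    return {name: hiring_rank[name] - score_rank[name] for name in esg_scores}

# Hypothetical data: ESG rating score and share of job postings that are "green".
esg_scores = {"Company A": 80, "Company B": 65, "Company C": 50}
green_share = {"Company A": 0.01, "Company B": 0.06, "Company C": 0.03}

gaps = rank_gaps(esg_scores, green_share)
for name, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(name, gap)
```

In this made-up example Company A has the best rating but the lowest green hiring share, which is the kind of mismatch an analyst might want to investigate further.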
AI and NLP again provide a powerful approach to exploring potential greenwashing. In figure 6 we present SDG scoring for a single company. In one instance the scores are based only on self-reported data, whereas the second analysis is based on alternative data from public sources. The difference in the scores is stark, with some SDG scores shifting from strong positive to strong negative depending on which source is used.
Figure 6: SDG Scores from Self-Reported Data and Alternative Data (Source: GlobalAI)
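A comparison like the one in figure 6 can be reduced to a simple check: which goals score strongly positive on self-reported data but strongly negative on alternative data? The sketch below shows one way to do this. The score scale (from -1 to 1), the 0.5 threshold for "strong", the function name `sign_flips`, and all the scores are assumptions made for the example and are not taken from the figure.

```python
def sign_flips(self_reported, alternative, threshold=0.5):
    """List goals that are strongly positive when self-reported
    but strongly negative according to alternative data."""
    flips = []
    for goal, reported in self_reported.items():
        alt = alternative.get(goal)
        if alt is None:
            continue  # no alternative-data score for this goal
        if reported >= threshold and alt <= -threshold:
            flips.append(goal)
    return flips

# Hypothetical scores on an assumed -1 (strong negative) to 1 (strong positive) scale.
self_reported = {"SDG 7": 0.8, "SDG 12": 0.6, "SDG 13": 0.7}
alternative = {"SDG 7": 0.5, "SDG 12": -0.7, "SDG 13": -0.6}

flipped = sign_flips(self_reported, alternative)
print(flipped)
```

Goals returned by a check like this are candidates for closer review, since the two sources disagree not just in degree but in direction.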
Patent data is another area that shows great promise for identifying potential greenwashing. Patents can provide an indication of investment in green technologies and this can be compared to a company’s claims around their green investments and future plans. Any mismatch could indicate greenwashing.
The ability of alternative data to close gaps in ESG data and spot potential company greenwashing is at an early stage but is rapidly gathering momentum. In this article we have only skimmed the surface of what’s possible.
On March 15th Eagle Alpha will host a webinar on this topic featuring BlackRock and UBS AM, and we will also publish a white paper on the same topic. In this article we shared some of the initial findings from our research for the white paper and webinar.