Alternative Data Cross-functional Teams and Workflow


The following is a guest article from Gene Ekster, who has worked with alternative data on both the buy side and the sell side. This article concludes a four-part series on alternative data. The previous article, Alternative Data Research Compliance, discussed compliance issues related to using nontraditional data in an institutional investment context.

This last piece in the series addresses the interaction between alternative data teams and investment teams within asset managers, including non-obvious benefits of such collaboration. It also touches on the data development workflow, from acquisition to application.

Relation to buy-side investment teams

Hedge funds’ and mutual funds’ development of in-house alternative data teams is a relatively new phenomenon aimed at helping funds monetize the opportunities presented by alternative data. Those teams, sometimes called “R&D” or simply “Data Groups,” generate value by first acquiring unique datasets and then developing them into insights that internal portfolio management teams can use.

Much of the process is highly technical, employing terabyte-scale databases, machine learning algorithms and data scientists, a world that appears removed from the day-to-day operations of traditional fundamental investors. Yet for a fund to derive the full benefit of data, it must integrate the human intelligence of investment managers with the intelligence derived from datasets.

In practice, the integration is realized via a continuous feedback loop between the portfolio management teams and R&D, where the combined experience of data analysts and investment professionals is employed to drive the research process. Unlike some sell-side research models, internal R&D teams should not exist as a one-way operation feeding information upstream to investment teams; doing so would defeat the purpose of building in-house capabilities.

Most data research projects are lengthy and resource intensive, so a potential investment use case must be taken into account before committing to a dataset. Frequent interaction between the data talent and the investment professionals allows multiple research directions to be explored simultaneously, some failing quickly and others succeeding spectacularly. Ultimately, funds gain the most benefit from employing at least one person to bridge the gap between the traditional investment process and the science of data research.

Incubation programs

If an internal alternative data group is a justifiable direction for a fund, it can derive a non-obvious benefit: a firm-wide uptick in data savviness and an enhanced Portfolio Manager (PM) development program. Alternative data expertise would propagate from the R&D group into the rest of the firm, including the investment teams, thereby increasing the fund’s overall data aptitude, which is quickly becoming a required competency for all investment professionals.

This can be accomplished via an analyst incubation program, where financial analysts from PM teams spend a portion of their time training inside the R&D group. Once analysts return to their original teams, they bring with them a wealth of technical skills that can add significant value to the group, especially if they end up trading with the same dataset on which they served their apprenticeship.

R&D Process Flow

The typical R&D analytical process is highly technical, and a full discussion of the details is beyond the scope of this article, but a brief highlight of the major steps is as follows:

  1. Acquire and Evaluate Datasets: R&D’s sourcing group contacts potential vendors, assesses compliance, acquires data samples, evaluates ROI, and establishes commercial relationships. A slightly modified version of the same process would apply to internally or externally harvested web data.
  2. Normalization: A broad category that applies to technical aspects of data processing, but not necessarily modeling, including converting unstructured data to structured data, cleansing and aggregating. Often R&D must secure datasets just for the purposes of calibration, such as publicly available data from the BLS (for instance the BLS Consumer Expenditures Survey), Census, Federal Reserve and other sources. Removing bias is a key step that enables broader insights to be gleaned from a dataset; it involves paneling and creating optimized weights to reduce the distance to one or several benchmark datasets. Compliance-related scrubbing of datasets is also performed in this step.
  3. Modeling: Once the dataset sample is normalized and representative of the population being measured, R&D proceeds to devise models to predict both past and future metrics used in the investment process. Typically these are GAAP and non-GAAP operational metrics such as revenues, margins and other KPIs specific to a particular industry, including unit sales, average prices, revenue per user, subscribers and churn. Currently, most alternative data driven investment strategies are fundamental, not quantitative, so modeling securities prices directly from the raw data is not yet a common practice.
  4. Publishing and Distribution: R&D distributes its insights to investing teams via internal products including Excel reports, dashboards, programmatic access to the structured data and qualitative analysis. Well-designed internal distribution systems can help address the thorny issue of attribution of value to the various groups involved in the data driven investment process.
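
To make the bias-removal part of the Normalization step more concrete, here is a minimal sketch of one simple reweighting technique, post-stratification. It is an illustration under stated assumptions, not a description of any particular fund's method: the "optimized weights" mentioned above may well be produced by more elaborate procedures. The age groups, spend figures and benchmark shares are hypothetical, and the function names are invented for this example.

```python
from collections import Counter


def poststratify_weights(panel_groups, benchmark_shares):
    """Return one weight per panelist so that the weighted share of each
    group in the panel matches its share in the benchmark population."""
    counts = Counter(panel_groups)
    n = len(panel_groups)
    unknown = set(counts) - set(benchmark_shares)
    if unknown:
        raise ValueError(f"no benchmark share for groups: {sorted(unknown)}")
    # weight = (share in benchmark) / (share in panel)
    return [benchmark_shares[g] / (counts[g] / n) for g in panel_groups]


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical panel that over-represents younger consumers.
panel_groups = ["18-34", "18-34", "18-34", "35+"]
panel_spend = [100.0, 100.0, 100.0, 200.0]

# Hypothetical benchmark shares, standing in for a source such as Census data.
benchmark_shares = {"18-34": 0.5, "35+": 0.5}

weights = poststratify_weights(panel_groups, benchmark_shares)
raw_mean = sum(panel_spend) / len(panel_spend)
adjusted_mean = weighted_mean(panel_spend, weights)

print(f"raw mean spend:      {raw_mean:.2f}")
print(f"adjusted mean spend: {adjusted_mean:.2f}")
```

In this toy panel the unweighted average understates spend because the lower-spending group is over-represented; the reweighted average corrects for that. A real panel would typically be balanced on several dimensions at once and checked against more than one benchmark.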

Alternative data research providers

Even if a fund does not have the capital, the skill set or simply the strategic need to build an internal group, it can enjoy partial benefits of alternative data without the sizable investment. Funds can turn to intermediaries who are in the business of acquiring third-party datasets, analyzing them and selling custom or syndicated research to buy-side clients. Research offerings from UBS’ Evidence Lab and Eagle Alpha are good examples of alternative data intermediaries, their research enriched by web harvesting of blogs, review forums and social media, among many other sources. Bloomberg’s PolarLake acquisition is an example of ever-larger, data-focused players joining the rapidly evolving field of intermediary alternative data providers. Many aspects of the R&D process flow detailed above apply to research sourced from external providers as well as to the raw data.


For a fund wishing to participate in the growing alternative data paradigm, building an in-house R&D team is a competitive advantage, but even having a strategy that addresses data sourcing, internal team collaboration and compliance would already be a key differentiator. In the near future, however, funds consuming unique data sources will no longer be the minority; competing against similarly endowed rivals will be the norm, and shops without an alternative data plan may find themselves at a disadvantage.

Although the shift is not without its challenges, fundamental buy-side shops are moving from relying on speculative instinct to a data-driven decision-making process, an unstoppable transformation with benefits that will ripple across the investment spectrum.


About the Author

Gene Ekster, CFA, was previously head of R&D at Point72 Asset Management (formerly SAC Capital), a Director of Data Product at 1010Data and a Senior Analyst at Majestic Research (now ITG Investment Research). Currently, Gene works with asset management firms and data providers in a consulting capacity to help integrate alternative data into the investment process. He can be reached via LinkedIn.
