The following is a guest article written by James Moran, President/Co-Founder of YipitData, which analyzes web data to provide KPI estimates and answer key questions for investors.
Integrating data analysts and engineers in the investment process is a major challenge for institutional investors looking to leverage alternative data. We face the same challenge here at YipitData: our 50+ engineers, data analysts, and research analysts have to work together to refine terabytes of raw data into KPI estimates and insights for institutional investors.
Take, for example, this analysis of competition between Expedia and Priceline, which concludes that even though both companies have a similar level of inventory, Priceline is essentially unchallenged in the crucial European battleground in terms of actual booking volume:
What makes this analysis so hard to build? You need engineers to build the systems that collect the raw data, data analysts to QA and structure it, and research analysts to figure out what you’re trying to answer. But then you’ll learn the original goal was impossible and have to iterate again and again through various constraints along the way.
Over the past 7 years, we’ve developed an organization and recruiting process that enables us to face this challenge. These have been the keys for us:
These aren’t separate functions. A common mistake funds make is hiring the best engineers and the best data scientists and siloing them from investment professionals. That would be the equivalent of building a separate group to meet with management teams – the function should be integrated as closely as possible with the investment decision. At YipitData, we like to say that everyone should be able to do each other’s jobs.
Investors should learn the basics of Python and MySQL. Knowing query languages is like knowing accounting. Everyone in the organization needs to speak them, and any smart person can learn enough of the basics to at least speak the right language. Learn Python the Hard Way and Learn SQL the Hard Way are great introductory resources for every team, and you can expand from there based on the specific challenges of the data you want to use.
Engineers and data analysts should learn the basics of fundamental investing. Have your research analysts lead training on valuation, fundamental analysis, and the key questions that drive your strategy. Once your technical team members know the actual questions that need to be answered, they’ll surprise you with insights you never thought possible.
Develop a data infrastructure that all teams can use. Learning the investor use case will enable engineers and data analysts to design the ideal infrastructure to ingest, store, and query datasets. And since your research analysts have learned query languages, they can work directly off the tables to set alerts or export query results to Excel. It’s more straightforward for a research analyst to ask a data analyst to construct a new column than to request a completely finished analysis (which never quite turns out the way that’s expected).
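As a rough illustration of what "working directly off the tables" could look like, here is a minimal Python sketch. It is not YipitData's actual infrastructure: the table name weekly_bookings, its columns, the company names, the figures, and the 20% alert threshold are all hypothetical, and an in-memory SQLite database stands in for whatever shared database a team really uses.

```python
import csv
import io
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical bookings table.

    The table name, columns, and numbers are made up for illustration.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE weekly_bookings (company TEXT, week INTEGER, bookings INTEGER)"
    )
    conn.executemany(
        "INSERT INTO weekly_bookings VALUES (?, ?, ?)",
        [
            ("CompanyA", 1, 1000),
            ("CompanyA", 2, 1300),
            ("CompanyB", 1, 800),
            ("CompanyB", 2, 820),
        ],
    )
    return conn


def week_over_week(conn):
    """Return (company, week, bookings, pct_change) rows using a self-join."""
    query = """
        SELECT cur.company, cur.week, cur.bookings,
               100.0 * (cur.bookings - prev.bookings) / prev.bookings AS pct_change
        FROM weekly_bookings AS cur
        JOIN weekly_bookings AS prev
          ON prev.company = cur.company AND prev.week = cur.week - 1
        ORDER BY cur.company, cur.week
    """
    return conn.execute(query).fetchall()


def alerts(rows, threshold_pct=20.0):
    """Keep only rows whose absolute week-over-week change meets the threshold."""
    return [row for row in rows if abs(row[3]) >= threshold_pct]


def to_csv(rows):
    """Serialize query results as CSV text, which Excel can open directly."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["company", "week", "bookings", "pct_change"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    conn = build_demo_db()
    rows = week_over_week(conn)
    for company, week, bookings, pct in alerts(rows):
        print(f"ALERT: {company} week {week} bookings changed {pct:+.1f}%")
    print(to_csv(rows))
```

The point of the sketch is the division of labor: the data analyst maintains the table (and could add a column such as pct_change upstream), while the research analyst only needs a short query, a threshold, and an export.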
Everyone should be accountable for the outcome. It’s tough to define the right goal for “data analysis” and the right goal for “engineering.” Research analysts should share accountability for the timeframe and risks of each data project. Overall, it’s better to have everyone work together on the shared goal of developing the best possible insights.
Technical Recruiting Process
You probably don’t need to hire a PhD. Realistically, this work is not on the bleeding edge of statistical modeling.
Favor some interest or background in finance. It ensures more purpose for the mission at hand.
Don’t go with your gut. You probably can’t judge this sort of talent based on your personal feel in an interview. Focus instead on objective signals of past success, then provide the candidate an assignment as quickly as possible.
The take-home exercise is key. Your process will succeed or fail based on your ability to design a prompt that aligns with the skillset required for the role. Provide an actual dataset with a specific deliverable. Don’t compromise. Don’t rationalize. The right candidate will return something that really impresses.
You’re not that great of a place to work. Data scientists are stars in basically any other organization outside of finance. As a money manager, you will face significant skepticism from candidates, who worry they will be second-class citizens. Also, your data problems aren’t particularly sexy – they probably don’t require machine learning or artificial intelligence. Prove to the best candidates that they will be core contributors to the team; otherwise, their process will be over before it begins.
Why It’s Worth The Effort
Alternative data is a seismic shift in the investment process on the scale of management access and expert networks. Funds need to develop new capabilities to make the most of it.
But the opportunity may be even greater than those earlier shifts. Because this data is granular, real-time, and cross-company, it can enable investors to understand the short-term and long-term business drivers better than experts and even the management teams themselves.