New York – Today, FirstRain President and CEO Penny Herscher contributed an article to the ResearchWatch blog. FirstRain is a primary search-based research firm focused on gleaning information from diverse sources on the Web. The article argues that the long tail of the information distribution, together with the plethora of web sites and information sources on the Web, makes customized and individualized search algorithms increasingly important for maximizing the value of the information available.
We include the article below:
Foster City, CA – Guest Posting: Penny Herscher, President & CEO FirstRain
Today’s business of investment research is increasingly a long tail problem, one that is very expensive to solve with traditional research approaches but that can, like many long tail problems, be solved with search technology. A “long tail” describes business models where products in a market have low frequency across a wide distribution and technology is used to give customers access to relevant products.
The Nature of the Problem
The thesis of the long tail was originally applied to online retail models, and examples abound in internet companies, the movie industry, simulation systems, and network traffic, to name just a few. In the case of the internet companies – Amazon, eBay, Google and Netflix – these businesses released value in their markets by giving consumers access to products (books, auction products, general information and movies, respectively) that live in the long tail of the distribution curve of available products.
Consider the similarities to doing qualitative research on a company. The high-frequency information that a researcher needs is straightforward to find (the initial deep part of a Pareto distribution curve of the information) – it’s in news feeds and articles from the mainstream sources of information. Traditionally this would have been combined with old-fashioned research: talking to management teams, trying to find surrounding sources of information and modeling the financial behavior of the company.
But the world of online information is expanding dramatically, and Web 2.0 techniques are driving new, highly distributed outlets for contemporary media. As a result, high-quality information can increasingly be found in low-frequency sources of information. The World Wide Web is populated by more than an estimated 125 million web sites; this translates into billions of web pages, and the number continues to grow exponentially. Non-traditional channels have opened up, and the line between “researcher” and “expert” is blurring as ease of publishing, combined with the growing appetite for quality information, has spearheaded a shift in perception of what is considered authoritative.
Clearly sources must be researched to be reliable, but even when they are, the data sources within the long tail are challenging in other ways. The frequency with which any one source yields information is too low to justify daily monitoring; the strength is in the collected stream of meaningful information. However, most major search engines have only cataloged ten to twenty percent of the web (as documented in “The Emerging Opportunity in Vertical Search”), and the relevance of the results contains its own long tail.
The cost in time and resources to sift through the information glut is prohibitive; and yet, investors cannot afford to miss meaningful information that a) impacts their investments and b) their competitors can see.
How the Long Tail Shows Up in Practice
Consider blogs as just one example of emerging information resources. Currently, Technorati tracks over 70 million blogs and estimates that 120,000 new blogs are created every day (as reported by Pandia). As with traditional resources, not all blogs are created equal when it comes to relevant, dependable data; but there is an ever-increasing number of blogs authored by well-respected, credentialed professionals such as Robert Scoble, the original technical evangelist behind Microsoft’s blog, and Derek Lowe, a Ph.D. in organic chemistry working for Bayer, to name just two.
In the FirstRain investment research product we process millions of documents from thousands of researched, authoritative sources, looking for relevant matches to our clients’ areas of interest. About 50% of the documents found are in the head of the curve (after we have filtered out all the obvious junk): news, press releases, or near duplicates of the same as news travels through the web. These are then separated out; the remainder are unique and need to be mined for fit to the client. Of the remaining tens of thousands of matching, relevant documents, about 10% of the published count (pushed to our clients) consistently come from blogs, confirming that they are a quality source of differentiated information to the investment community.
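To make the head-versus-tail split concrete, here is a minimal sketch in Python. It is an illustration only, not FirstRain's actual pipeline: the `Doc` type, the `source_kind` labels, the word-shingle similarity measure, and the 0.6 threshold are all assumptions chosen for the example. It treats news, press releases, and near duplicates of already-seen text as the head of the curve and leaves everything else in the tail to be mined.

```python
from dataclasses import dataclass

# Hypothetical labels for "head of the curve" sources (an assumption).
HEAD_KINDS = {"news", "press_release"}


@dataclass
class Doc:
    source_kind: str  # e.g. "news", "press_release", "blog", "other"
    text: str


def shingles(text, k=3):
    """Return the set of k-word sequences (shingles) in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def split_head_and_tail(docs, threshold=0.6):
    """Separate head documents (news, press releases, near duplicates)
    from the unique remainder that still needs to be mined."""
    head, tail = [], []
    seen = []  # shingle sets of every document processed so far
    for doc in docs:
        sh = shingles(doc.text)
        is_near_duplicate = any(jaccard(sh, s) >= threshold for s in seen)
        seen.append(sh)
        if doc.source_kind in HEAD_KINDS or is_near_duplicate:
            head.append(doc)
        else:
            tail.append(doc)
    return head, tail


def blog_share(docs):
    """Fraction of the given documents that come from blogs."""
    if not docs:
        return 0.0
    return sum(d.source_kind == "blog" for d in docs) / len(docs)
```

Calling `split_head_and_tail(docs)` on a list of `Doc` objects returns the two groups, and `blog_share(tail)` gives the fraction of the remaining documents that come from blogs. The pairwise comparison is quadratic in the number of documents, which is fine for a sketch; at the scale of millions of documents a production system would need an indexing scheme instead.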
For example: a portfolio manager who follows Genentech may have seen the posting on the Wall Street Journal’s health blog regarding Genentech’s collaboration with Abbott Laboratories as a “hedge against the future of Avastin.” This same manager may have missed the posting on ScienceBlog.com about a faculty member of the Medical College of Georgia and the discovery of a new enzyme that is being rigorously investigated for colorectal cancer treatments (a target Avastin is used for), or another posting on CancerBlog.com about a new warning linking Avastin to TE fistula and patient death.
The solution is a focused, relevant stream of information for each client. As was the case with e-tailers, powerful search technology is the way to harness the significant qualitative information that is currently underrepresented or underutilized in today’s investment research process. Vertical search for investors, which we call search-driven research, is an up-and-coming field that goes beyond mainstream consumer-focused search to extract relevant information specific to investment professionals. I write about this new field, and about running a young company, in my blog.
[Disclaimer: Sandy Bragg of Integrity Research Assoc. is an advisor to FirstRain]