The following is a guest article by Sal Restivo, a co-founder of RIXML.org and Chairman of the RIXML Standards Committee.
RIXML.org, which maintains the industry-wide RIXML standard for tagging investment research in XML, is about to launch a pilot program to standardize the components of research reports. The new componentization guidelines are designed to make it easier to parse research reports and extract the information of greatest interest.
RIXML.org is a standards body operated by member firms from the buy-side and the sell-side, as well as research aggregation vendors, market data vendors, and technology vendors. The organization created the RIXML schema 15 years ago, and it has since been widely adopted across the investment research marketplace. The RIXML schema gives research publishers a standard way to tag their reports so that publishers and aggregators can categorize them effectively and improve the research discovery process for their readers.
Now the organization is taking the next step by creating a standard framework for publishers to identify internal components of their reports. Development of this componentization framework has reached the stage where it’s ready for critical evaluation via a pilot program.
Research reports are typically made of several component parts, such as the Investment Thesis, Valuation Method, and Key Assumptions. Each part plays a specific role in communicating the author’s message to the reader. The presence of a key phrase within one component may have a different meaning or significance from the same phrase in another component.
With the whole document as the sole unit of research content, an interested party cannot focus a search operation on any particular parts. Additionally, the archive being searched cannot offer results that directly point to the most relevant parts. Even if the document is well-tagged in an accompanying RIXML file, the most the archive can do is refer the interested party to a qualifying document or set of documents. The document itself is too coarse-grained a unit to support these types of operations effectively.
To offer more specific results, a finer-grained unit of content is needed. By providing component-level addressability, the RIXML organization expects to improve the precision of tagging and searching. Better precision in the research discovery process leads to higher-quality results. And that would be a natural extension of what RIXML already does.
Standard Component Types
The RIXML organization is now developing standardized tagging at the component level, rather than stopping at the document level. We worked with our member firms to analyze a broad sampling of published content. We collected lists of component types from the sample content and collated them. We removed duplicates and esoterics, and pared the list down to what we consider a canonical result.
The resulting list of standard component types will be the basis for component-level addressability. Publishers can apply the standard component type labels from the RIXML list to their own content, as hidden tags within an HTML5 rendering, to communicate context to consumers. Research consumers will then benefit from component addressability across publishers.
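To make the idea of hidden tags more concrete, here is a minimal sketch of how a consumer might read component labels out of an HTML5 rendering. It is an illustration only, not the RIXML guidelines themselves: the attribute name data-rixml-component, the exact spelling of the labels, and the sample markup are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Hypothetical attribute name, assumed for this sketch; the actual
# RIXML guidance may specify a different tagging mechanism.
COMPONENT_ATTR = "data-rixml-component"

# Invented sample markup for illustration only.
SAMPLE_HTML = """
<article>
  <section data-rixml-component="Summary">
    <p>Rising beef costs may pressure margins.</p>
  </section>
  <section data-rixml-component="Valuation Method">
    <p>We apply a discounted cash flow model.</p>
  </section>
</article>
"""


class ComponentExtractor(HTMLParser):
    """Collects the text found inside each tagged component."""

    def __init__(self):
        super().__init__()
        self.components = []  # list of (label, text) pairs
        self._label = None    # label of the component being read, if any
        self._tag = None      # element name that opened the component
        self._depth = 0       # nesting depth of that element name
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if self._label is None:
            label = dict(attrs).get(COMPONENT_ATTR)
            if label:
                self._label, self._tag = label, tag
                self._depth, self._chunks = 1, []
        elif tag == self._tag:
            self._depth += 1

    def handle_endtag(self, tag):
        if self._label is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join(" ".join(self._chunks).split())
                self.components.append((self._label, text))
                self._label = None

    def handle_data(self, data):
        if self._label is not None:
            self._chunks.append(data)


def extract_components(html):
    """Return (label, text) pairs for every tagged component in html."""
    parser = ComponentExtractor()
    parser.feed(html)
    parser.close()
    return parser.components
```

Because the labels live in attributes rather than in visible text, the reader's view of the report is unchanged, while an aggregator can still recover each component and its type. This sketch ignores components nested inside other components.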
Benefits of Componentization
As an example, imagine an investor interested in learning more about analysts’ views on how rising beef costs will affect the short-term performance of Shake Shack’s stock price. A typical search using the key phrases “shake shack” and “beef costs” may yield some good results, that is, research reports that have something to say on the subject.
However, if the search space can be focused on components tagged as Summary or Key Investment Drivers, the investor will identify the most relevant insights even more quickly and easily. Component-level addressing pushes higher-quality search results to the top and prunes lower-quality results.
Let’s consider that example a bit differently. Suppose the investor did not focus the search on specific components, but instead made a general query using the same key phrases. Even when the investor does not make use of component-level addressability, the archive can include references to the components containing the key phrases when presenting the search results. This draws the investor’s attention more directly to the most relevant parts of a document, akin to a “light box” that shines a light on those parts while maintaining the overall context.
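Both variations of the Shake Shack example can be sketched in a few lines of code. The following is a simplified illustration under stated assumptions, not part of the RIXML framework: the Component record, the search function, and the sample archive are all hypothetical, and matching is naive substring matching that requires every key phrase to appear within the same component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    doc_id: str
    label: str  # a standard component type label
    text: str


# Invented sample archive for illustration only.
ARCHIVE = [
    Component("report-1", "Summary",
              "Beef costs are rising, and Shake Shack margins may narrow."),
    Component("report-1", "Valuation Method",
              "We value the shares with a discounted cash flow model."),
    Component("report-2", "Key Investment Drivers",
              "Shake Shack traffic growth offsets higher beef costs."),
    Component("report-2", "Key Assumptions",
              "We assume Shake Shack absorbs higher beef costs through pricing."),
]


def search(archive, phrases, labels=None):
    """Return {doc_id: [labels of matching components]}.

    A component matches when it contains every phrase (case-insensitive).
    If labels is given, only components of those types are searched.
    """
    wanted = [p.lower() for p in phrases]
    hits = {}
    for comp in archive:
        if labels is not None and comp.label not in labels:
            continue
        text = comp.text.lower()
        if all(p in text for p in wanted):
            hits.setdefault(comp.doc_id, []).append(comp.label)
    return hits


phrases = ["shake shack", "beef costs"]

# Focused query: search only the component types the investor cares about.
focused = search(ARCHIVE, phrases, labels={"Summary", "Key Investment Drivers"})

# General query: search everything, but still report which components matched.
general = search(ARCHIVE, phrases)
```

In the focused query, the Key Assumptions match in report-2 is pruned from the results. In the general query, it is kept, but each document comes back with the labels of the components that matched, which is the information an archive would need in order to highlight those parts for the reader.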
The RIXML componentization framework offers finer granularity in addressing content: in effect, a higher-resolution approach to querying and finding the best-quality results. It gives the interested party a means of expressing more precisely the material sought. And it gives the party serving the results a way of focusing the reader’s attention based on relevance.
Before finalizing and releasing the framework, we plan to run a pilot program to illustrate and validate our work. Our guidance documentation lays out the method for applying the standard component type labels and advises on how they can be put to good use in the discovery process. We will implement component tagging according to our own guidelines against a set of sample research content in HTML5 format. We will then use the tagged content to illustrate key use cases, possibly via an e-seminar.
Once we’ve reached the right level of consensus, we will post the framework materials for general availability. Interested readers can learn more about the RIXML organization and our standards for enriching the investment research discovery process by visiting our web site at RIXML.org.
RIXML.org is a consortium of buy-side financial services firms, sell-side financial services firms, and technology vendors who provide products and services for creating and distributing investment research. The goal of RIXML.org is to improve the process of electronic research distribution by defining an open protocol for categorizing, aggregating, comparing, sorting, searching, and distributing global financial research. The individuals who represent their firms include both IT experts and business-side product managers who represent the analysts, portfolio managers, and others who both produce and consume investment research.