We here at Epsilon are always interested in the evolving landscape of passive and passive-like investment vehicles.  A few weeks ago, BlackRock, the world’s largest purveyor of ETF investment products, announced the launch of a new series of sector-focused ETFs that put a twist on vanilla passive ETFs.  They seek to redefine sectors by using “machine learning, natural language processing and clustering algorithms to create a new classification system based on a company’s related data (e.g., regulatory filings, quarterly earnings report).”  Beyond that cryptic sentence, there isn’t much more in their regulatory filings or in the press.  As Bloomberg reported, “a BlackRock spokesman declined to comment beyond the information in the regulatory filings.”

We found the idea interesting, because we too try to think outside traditional classification structures when looking at securities.  Doing so provides flexibility and a novel perspective for things like portfolio optimization, with the goal of improving outcomes from active share choices.  While the motive behind active share decisions by portfolio managers is obvious (seeking to outperform), the differences in classifications among index providers are, at root, philosophical.

A tangible example would be the delineation between technology, media, and telecom (“TMT”) stocks, which are grouped together by many active managers, but are classified quite differently depending on which hierarchy provider you ask.  S&P Global GICS – perhaps the gold standard in industry classification – will break these stocks into three level-one categories (Consumer Discretionary, Information Technology, and Telecommunications).  Bloomberg has another way.  Ditto for FactSet.  Why the discrepancy?  One only needs to look at companies such as Apple or Amazon to question how a company could be classified one-to-one as Consumer or Tech.

Intrepid data providers have gone deeper than the one-to-one relationship of company to sector and have engaged in techniques such as balance sheet, revenue, and geographic deconstruction in creating more prismatic hierarchical classifications.  After all, how does one think of the geographic footprint of a multi-national corporation as a one-to-one mapping?  The BlackRock line of products is presumably following this thinking, expanding on any one-to-one mapping by fracturing a security into different groupings.  Think Apple geographically defined by its revenue or EBIT by region (e.g., 45% US, 25% Europe, 20% China).  While source of revenue (or economic value) for geography is a bit more straightforward, industry classification requires something closer to fundamental security analysis in decomposing idiosyncratic business lines.  Enter buzzword technologies to do the heavy lifting (we kid).
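
To make the fractional mapping concrete, here is a minimal Python sketch.  The company names, the weights, and the `portfolio_exposure` helper are all hypothetical illustrations of ours, not data or methods from any provider; the only point is that a one-to-many mapping rolls up to portfolio-level exposure as a weighted sum.

```python
from collections import defaultdict

# Hypothetical fractional mappings: each company's weights sum to 1.0.
# Figures are illustrative only, not actual company data.
COMPANY_EXPOSURE = {
    "DeviceCo": {"US": 0.45, "Europe": 0.25, "China": 0.20, "Other": 0.10},
    "RetailCo": {"US": 0.70, "Europe": 0.20, "Other": 0.10},
}


def portfolio_exposure(holdings, company_exposure):
    """Roll position weights up into group-level exposure.

    holdings: {company: portfolio weight}
    company_exposure: {company: {group: fraction of revenue}}
    """
    totals = defaultdict(float)
    for company, position in holdings.items():
        for group, fraction in company_exposure[company].items():
            totals[group] += position * fraction
    return dict(totals)


# A hypothetical two-stock portfolio.
holdings = {"DeviceCo": 0.6, "RetailCo": 0.4}
exposure = portfolio_exposure(holdings, COMPANY_EXPOSURE)

# Print groups from largest to smallest exposure.
for group, weight in sorted(exposure.items(), key=lambda kv: -kv[1]):
    print(f"{group:<8}{weight:.1%}")
```

The same roll-up works unchanged if the groups are business lines rather than regions, which is the harder decomposition discussed here.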


To give the example of Intel (demonstrated in the Revere slide above), any research analyst performing a valuation may break down a business like Intel into its core microprocessors, memory, its internet communication, and its newly acquired (uncharted) vision-based analytical suite.  While all of this is technically tech, multiples for subsectors (and subindustries) of tech are vastly different, as are sensitivities to news flow and industry trends (e.g., FPGA vs CPU vs GPU).  For the purposes of an ontological classification, it may be better to divide the company into different sector allocations rather than assign it to one naïve industry or sub-industry.  A company like Amazon exemplifies this even better, split as it is between traditional retail, web-services/infrastructure, and hardware.

Of course, BlackRock was in the news some time ago for cool ML/NLP work billed as the future of active management.  The news headlines that followed were all about underperformance, the reasons for which are impossible to discern as a spectator.  Creating quantitative models which seek to find economic sensitivity through fundamental analysis is one thing.  Creating investment signals that seek to profit from differences in classifications is another.  It’s also whimsical to see the world’s largest ETF provider do so, given that ETFs are often maligned for causing these divergences in the first place!

Again, purely as conjecture, perhaps the firm found that fighting the good fight wasn’t an easy way to generate alpha…so why not leverage the infrastructure and technology work for a more passive product?  First, you are providing visibility into your thinking (regarding true classification) to market participants.  You’re also providing channels for capital flow to bring your thesis to reality.  In effect, you’re self-catalyzing while creating alternative ways to monetize this insight beyond active positioning.  Smart thinking!

Now the question is: how can advanced techniques help one achieve this purported goal?  We can only guess given the scant disclosure, but if we were forced to do it given a need to create captivating blog content, we’d approach the problem as follows:

  1. Use natural language processing to parse public regulatory filings in a way that goes deeper than simple balance sheet XML data;
  2. Use that data to define micro verticals of business activities on a company-by-company basis;
  3. Build a more flexible classification algorithm across all companies that uses resemblance pattern recognition through supervised/unsupervised machine learning;
  4. Test if the behaviors of company stock activity can be synthetically predicted based upon this lens of decomposition;
  5. If historical correlations prove significant, use the output to construct “smarter” indices;
  6. Hopefully profit…?
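
Steps 1 through 4 can be sketched in a few dozen lines of Python.  This is a toy under loud assumptions and not BlackRock’s method: the filing excerpt, the theme vocabularies, and the return series are all invented, a bag-of-words cosine similarity stands in for real NLP, and the helper names (`theme_weights`, `correlation`) are ours.

```python
import math
import re
from collections import Counter

# Hypothetical seed vocabularies for each "micro vertical" (step 2).
THEMES = {
    "retail": "stores merchandise shoppers fulfillment retail",
    "cloud": "cloud servers compute storage hosting",
    "hardware": "devices tablets speakers hardware manufacturing",
}


def bag_of_words(text):
    """Lower-case word counts; a crude stand-in for NLP parsing (step 1)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def theme_weights(filing_text, themes=THEMES):
    """Soft-classify a filing: theme similarities normalized to sum to 1 (step 3)."""
    doc = bag_of_words(filing_text)
    scores = {name: cosine(doc, bag_of_words(seed)) for name, seed in themes.items()}
    total = sum(scores.values())
    return {name: (s / total if total else 0.0) for name, s in scores.items()}


def correlation(xs, ys):
    """Pearson correlation between actual and synthetic returns (step 4)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Invented filing excerpt for a hypothetical company.
filing = ("We sell merchandise to shoppers through online retail stores, "
          "rent cloud compute and storage, and design devices such as tablets.")
weights = theme_weights(filing)

# Invented per-period theme returns; the synthetic return is their weighted blend.
theme_returns = {
    "retail": [0.01, -0.02, 0.00, 0.03],
    "cloud": [0.04, 0.01, -0.01, 0.05],
    "hardware": [-0.01, 0.00, 0.02, 0.01],
}
synthetic = [sum(weights[t] * theme_returns[t][i] for t in weights) for i in range(4)]
actual = [0.02, -0.01, 0.00, 0.04]  # invented stock returns

print({t: round(w, 2) for t, w in weights.items()})
print("correlation:", round(correlation(actual, synthetic), 2))
```

With four made-up data points the correlation proves nothing, of course; a real test would need long return histories, out-of-sample periods, and a comparison against the stock’s conventional sector index before moving on to step 5.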

We await greater detail on the products with bated breath.
