Mapping the potential of geospatial data

Adam Baron
Director of Big Data Quantitative Research for StarMine, Refinitiv

GPS information from smartphone apps gives quant investors a new and exciting alternative data source to explore, for uses such as mapping Same Store Sales. Our advice highlights the best ways to realize the potential of geospatial data in the #SearchForAlpha.


  1. The quant investing opportunities offered by GPS as an alternative data source are expanding rapidly due to the number of apps and device types.
  2. Even if your alternative data signal doesn’t immediately yield alpha, the value may be revealed by looking at it alongside other data sets you already use.
  3. Refinitiv data and the data of our alternative data partners are mapped with PermID to enable connectivity across our entire ecosystem.

At StarMine Research, we develop quantitative finance models and feeds for resale. I work with alternative data and I’ve seen that over the past year or so, geospatial data has been particularly dominant and is showing some really interesting potential.

Most people have dozens of apps installed on their smartphones. When you click “accept”, the fine print usually grants the app permission to use your phone’s GPS and to resell your location data. Many app companies anonymize and sell the data to third parties.

Aggregators that collect the data from app companies sell to hedge funds that want to understand business activity in order to inform their investment strategy. However, the aggregators don’t disclose the specific apps behind the data for privacy reasons.

How to map alternative data

The first step to understanding geospatial research in relation to equities is to map latitude/longitude activity to a business.
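As a rough illustration of that mapping step, the sketch below assigns a single latitude/longitude reading to the nearest store within a fixed radius. The store coordinates, the identifiers and the 50-meter radius are all hypothetical; real vendors may supply building polygons or pre-mapped locations instead, so treat this as one simple approach under those assumptions rather than a standard method.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_store(lat, lon, stores, max_radius_m=50.0):
    """Return the id of the closest store within max_radius_m, or None.

    stores: dict mapping a (hypothetical) store id to a (lat, lon) tuple.
    max_radius_m: illustrative cut-off, not a recommended value.
    """
    best_id, best_dist = None, max_radius_m
    for store_id, (s_lat, s_lon) in stores.items():
        dist = haversine_m(lat, lon, s_lat, s_lon)
        if dist <= best_dist:
            best_id, best_dist = store_id, dist
    return best_id


# Hypothetical store locations for illustration only
stores = {"store_a": (40.0, -75.0), "store_b": (40.01, -75.0)}
matched = nearest_store(40.0001, -75.0, stores)    # roughly 11 m from store_a
unmatched = nearest_store(41.0, -75.0, stores)     # far from both stores
```

A linear scan like this is fine for a handful of stores; with billions of device-level points you would likely want a spatial index, which is beyond this sketch.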

This alternative data set is pretty new, so the history is much shorter than for traditional financial data and the company universe is smaller (retail businesses compared to the whole stock market).

Although there is huge variance across vendors in how geospatial data is offered, my first piece of advice is to focus on the data you have.

If you’re studying Same Store Sales (SSS), then start with the universe of publicly traded companies (and their subsidiaries) for which you have SSS estimates and actuals. Don’t waste time mapping private companies in the vendor’s data set.

My second piece of advice is to ask yourself if the mapping effort is worth it. If you are considering different vendors that offer similar data, the one with the well-mapped content set is an easy first choice.

If, however, you come across an obscure/unique/untapped geospatial content set that is poorly mapped, your mapping effort could be handsomely rewarded by unlocking insights not yet known to the market at large.

A piece in the financial markets puzzle

One goal we’re striving for at Refinitiv is that all our own data and the data of our alternative data partners are mapped with PermID to enable connectivity across our entire ecosystem.

Geospatial data is just one puzzle piece in the complex portrait of financial markets. While you might be able to detect visits to Walmart, the GPS data won’t give you insight into what’s going on at Walmart.com. That matters if a significant amount of revenue comes from outside the stores.

In our I/B/E/S data set, we have a ton of industry-specific key performance indicators (KPI).

The most relevant KPI for geospatial data is SSS — how much revenue comes from the bricks-and-mortar side of the business. We have both the historical actuals as well as estimates from sell-side analysts as to what those SSS numbers may be in future quarters.

Additionally, there is subsidiary-level SSS data. Studying Taco Bell, KFC and Pizza Hut separately, rather than as YUM! Brands, lines up much better with the geospatial data.

It’s worth a quick reminder that different companies have different fiscal years — ensure your time periods align with exactly what you’re trying to study.
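One way to keep periods aligned is to bucket daily footfall into each company's own fiscal quarters before comparing it with reported SSS. The sketch below is a simplification: it assumes fiscal quarters are whole calendar months and that you know the month in which the fiscal year ends (`fy_end_month`, a hypothetical input). Many retailers use 52/53-week calendars with quarter ends that do not fall on month boundaries, so in practice you would want the actual period end dates.

```python
from datetime import date


def fiscal_quarter(day, fy_end_month):
    """Return (fiscal_year, quarter) for a calendar date.

    fiscal_year is labeled by the calendar year in which the fiscal year ends.
    Assumes fiscal quarters are made of whole calendar months.
    """
    months_into_fy = (day.month - fy_end_month - 1) % 12
    quarter = months_into_fy // 3 + 1
    fiscal_year = day.year + 1 if day.month > fy_end_month else day.year
    return fiscal_year, quarter


def visits_by_fiscal_quarter(daily_visits, fy_end_month):
    """Sum a dict of {date: visit_count} into {(fiscal_year, quarter): total}."""
    totals = {}
    for day, count in daily_visits.items():
        key = fiscal_quarter(day, fy_end_month)
        totals[key] = totals.get(key, 0) + count
    return totals


# Hypothetical daily counts for a company whose fiscal year ends in January
daily = {date(2020, 1, 31): 5, date(2020, 2, 1): 7, date(2020, 2, 2): 3}
quarterly = visits_by_fiscal_quarter(daily, fy_end_month=1)
```

The same February date lands in a different fiscal quarter for a company with a December year end, which is exactly the mismatch to watch for.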

Don’t get discouraged if your alternative data signal alone doesn’t yield alpha; sometimes the value will be revealed by looking at it alongside other data sets you already use.

Making the right comparisons

Compared with traditional financial content sets, alternative data, such as geospatial information, is expanding rapidly due to new apps, new users and new devices.

While it’s certainly a good thing that vendors might cover 50 times more devices in their universe today than they did a few years ago, it sure makes backtesting challenging.

To put footfall and financials in like terms, it’s common to study year-over-year percentages. It may look like all the businesses in your study have significantly more footfall this year compared with last year.

However, if you have many more users/devices in your universe this year, that would explain the increased footfall without necessarily saying anything definitive about business performance.

You could hold the set of devices constant in your year-over-year comparisons, using only devices present in both periods, so you can observe how that subset of individuals changed behavior over time.
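A minimal sketch of that comparison follows. It assumes each period's data is a list of `(device_id, business_id)` visit records, which is a hypothetical layout, and it treats a device as part of the universe if it appears anywhere in a period. It reports the raw year-over-year change next to the change for the common device panel.

```python
def yoy_pct(current, prior):
    """Year-over-year change in percent, or None when the prior value is zero."""
    if prior == 0:
        return None
    return (current - prior) / prior * 100.0


def raw_and_panel_yoy(visits_prior, visits_current, business):
    """Compare raw YoY footfall with YoY footfall for devices seen in both years.

    visits_prior, visits_current: lists of (device_id, business_id) tuples.
    Returns (raw_yoy_pct, panel_yoy_pct, panel_size).
    """
    devices_prior = {device for device, _ in visits_prior}
    devices_current = {device for device, _ in visits_current}
    panel = devices_prior & devices_current  # devices present in both periods

    def count(visits, devices=None):
        return sum(
            1 for device, biz in visits
            if biz == business and (devices is None or device in devices)
        )

    raw = yoy_pct(count(visits_current), count(visits_prior))
    held = yoy_pct(count(visits_current, panel), count(visits_prior, panel))
    return raw, held, len(panel)


# Hypothetical records: the device universe grows from 2 to 5 devices
prior = [("d1", "A"), ("d1", "A"), ("d2", "A"), ("d2", "B")]
current = [("d1", "A"), ("d1", "A"), ("d1", "A"), ("d2", "B"),
           ("d3", "A"), ("d4", "A"), ("d5", "A")]
raw, held, panel_size = raw_and_panel_yoy(prior, current, "A")
```

In this toy data, raw footfall at business A doubles, but the two devices present in both years made the same number of visits, so the apparent growth comes entirely from the larger universe.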

Sometimes this might not work so well because app partners and app loyalty may churn a lot, leaving you with a static mobile device universe that is too small to be useful.

Boost your predictive power

Another approach might be normalizing by day in a formula like Visits to Business/Total Universe Visits, so you’re comparing the slice of visits a business receives. But this may be too broad.

I’ve found that Visits to Business/Total Visits Across Industry helps mitigate the expanding universe bias and differentiates between competitors.
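The Visits to Business/Total Visits Across Industry ratio can be sketched as below. The input layout (a dict keyed by day and business) and the industry labels are assumptions made for illustration, not a vendor format.

```python
from collections import defaultdict


def industry_share(daily_visits, industry_of):
    """Return each business's share of its industry's visits, per day.

    daily_visits: dict mapping (day, business) -> visit count.
    industry_of: dict mapping business -> industry label.
    Days on which an industry has zero total visits are skipped.
    """
    totals = defaultdict(int)
    for (day, biz), count in daily_visits.items():
        totals[(day, industry_of[biz])] += count

    shares = {}
    for (day, biz), count in daily_visits.items():
        total = totals[(day, industry_of[biz])]
        if total > 0:
            shares[(day, biz)] = count / total
    return shares


# Hypothetical data: total visits double between the two days
industry_of = {"A": "restaurants", "B": "restaurants"}
daily_visits = {
    ("day1", "A"): 10, ("day1", "B"): 30,
    ("day2", "A"): 30, ("day2", "B"): 50,
}
shares = industry_share(daily_visits, industry_of)
```

Here both businesses see more raw visits on the second day, yet A's share of industry visits rises from 25% to 37.5% while B's falls from 75% to 62.5%, which is the competitive signal the raw counts hide.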

While you might occasionally encounter a geospatial data vendor that summarizes activity by business by day, it is often up to you to make sense of billions of device-level data points. It might be tempting to consider every data point in your study, but your results might improve if you are careful about what you exclude.

For example, the duration spent at a business matters. Too short and the person may merely be passing by. Too long and it might be a worker rather than a customer.
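A dwell-time filter along these lines might look like the sketch below. The five-minute and three-hour thresholds are placeholders, not recommendations, and the `arrival`/`departure` field names are hypothetical; sensible cut-offs will differ by type of business and are worth testing.

```python
from datetime import datetime, timedelta


def likely_customer_visits(visits,
                           min_dwell=timedelta(minutes=5),
                           max_dwell=timedelta(hours=3)):
    """Keep visits whose dwell time falls between min_dwell and max_dwell.

    visits: list of dicts with 'arrival' and 'departure' datetimes.
    Shorter stays are treated as passers-by, longer ones as likely workers.
    """
    return [
        v for v in visits
        if min_dwell <= v["departure"] - v["arrival"] <= max_dwell
    ]


# Hypothetical visits: a 1-minute pass-by, a 30-minute stop, an 8-hour shift
start = datetime(2020, 1, 1, 12, 0)
visits = [
    {"device": "d1", "arrival": start, "departure": start + timedelta(minutes=1)},
    {"device": "d2", "arrival": start, "departure": start + timedelta(minutes=30)},
    {"device": "d3", "arrival": start, "departure": start + timedelta(hours=8)},
]
kept = likely_customer_visits(visits)
```

Keeping the thresholds as parameters makes it easy to rerun a study under different exclusion rules and compare the results.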

It’s also worth noting that cities can be quite problematic for geospatial data compared with suburban or rural areas. Consumer-facing businesses in cities tend to have smaller retail footprints, are densely packed next to other businesses and often reside on one floor of a very tall building.

Once you have your basic quantitative study code in order, experiment with filtering your universe to see how your results change with different exclusion approaches. You may find better predictive power with fewer data points.

The potential of geospatial data

One thing I love about working with alternative data is that it feels like a lightly explored frontier. In geospatial data alone, there is so much potential in slicing and dicing and analyzing. You just need to dream up a use case then figure out how to calculate it from the massive amount of data.

It’s worth noting that GPS data is captured from many other devices besides just mobile phones. GPS capability exists in your car, in commercial trucks and in ships and airplanes. You may wear it in your fancy fitness device. Your dog may wear it in her fancy collar.

What I’m most excited about is the potential to study business-to-business activity, such as a commercial truck picking up from a factory and dropping off to several business customers. The potential of geospatial data is just beginning.

For more insight relevant to quant investing visit Destination Quant.
