Seeing the Forest From the Trees
By Fabion Kauker
Chief Information Officer, Hexvarium
Recently, ETH Zurich published its “Global Canopy Height 2020” data set, along with a viewer in Google Earth Engine. This is a great potential resource for broadband deployments, and we are going to explore how to access the data and use it to inform how we think about the cost and timeline of a project. Permits often introduce a high level of uncertainty into the construction of potential cable routes. What if we could identify these risks before they materialize? And going a step further, could we use this to influence where and when we deploy?
To do this we need to get the data, then transform, store, and query it. Luckily, we have the tools and data models for this, and they support many other use cases across the broadband network construction process as well.
As you can probably tell, this approach involves hexagons! We are using H3, a library developed at Uber to solve global-scale spatial indexing challenges, which has since been adopted by the broader community.
Mark Litwintschik showed how to do this using data processing tools such as gdal2xyz and ClickHouse. We have adopted this approach and are taking it a step further by using the unfolded.ai Hex Tile API and Studio, its interactive analytics platform.
Once we have the data in Unfolded, we can publish maps that can be shared across our organization and with external users.
You can look at this example here:
Now, this is just one of hundreds of data sets we are examining and combining to build the most detailed US national map of cost, revenue, and competition for the broadband market. We then combine these data with optimization models (thanks, LocalSolver!) to find the best opportunities that meet investment directives or bridge the digital divide. Or, as we like to put it, “how and where to build fiber infrastructure for the highest possible impact and the best possible return.”
For example, we can take each unique identifier across the US for which we have data and join on it. “89283082893ffff” is a resolution 9 hexagon in San Francisco. We can then use these identifiers to build richer and more complex models. Further, we can bring back any data we gather in the field to update our predictions. This takes us into the temporal domain: we make predictions not just for a location, but for a location at a point in time!
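An H3 cell index encodes its own resolution: in the H3 specification, the resolution occupies bits 52–55 of the 64-bit identifier (the h3-py library exposes the same information through a helper such as `get_resolution`). As a dependency-free sketch, we can verify that the cell above really is resolution 9:

```python
def h3_resolution(cell: str) -> int:
    """Resolution of an H3 cell index (bits 52-55 of the 64-bit id, per the H3 spec)."""
    return (int(cell, 16) >> 52) & 0xF

print(h3_resolution("89283082893ffff"))  # -> 9, a resolution-9 cell in San Francisco
```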
Everything is as simple as a curl command, or you can do it in the UI 🙂
Once we have the data, as created in Mark’s blog post, we need to extract the resolution we want and then upload it to the Unfolded processing capability. This is as simple as two commands.
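The upload commands depend on your Unfolded account, but the extraction step can be sketched with pandas. Here we assume a hypothetical table with an `h3_index` column (one row per cell, as produced by the raster-to-H3 conversion) and keep only the resolution we want:

```python
import pandas as pd

def filter_resolution(df: pd.DataFrame, res: int) -> pd.DataFrame:
    """Keep only rows whose H3 index is at the requested resolution
    (the resolution lives in bits 52-55 of the 64-bit H3 id)."""
    cell_res = df["h3_index"].map(lambda c: (int(c, 16) >> 52) & 0xF)
    return df[cell_res == res]

# Hypothetical sample: mixed-resolution cells with canopy heights
df = pd.DataFrame({
    "h3_index": ["89283082893ffff", "8828308281fffff"],  # res 9 and res 8
    "canopy_height_m": [12.0, 9.5],
})
print(filter_resolution(df, 9))  # only the resolution-9 row remains
```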
Then, once the data set is loaded, it can be converted to Hex Tile.
From here we go to the UI to see if the data is ready to be mapped.
Once the “Hex Tile Status” is “Ready” we can explore the data.
We want to determine the relative frequency of tree canopy along a given potential fiber route.
This can be achieved using the data export functionality and some Python pandas code.
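As a sketch of what that pandas code might look like (the column name and bin edges are illustrative assumptions, not the actual export schema), the relative frequency of canopy height along a route can be computed by binning the exported cells:

```python
import pandas as pd

# Hypothetical route export: one row per H3 cell along the route
route = pd.DataFrame({"canopy_height_m": [0, 0, 3, 7, 12, 18, 0, 5]})

# Bin heights and normalize the counts to get a relative-frequency table
bins = [0, 1, 5, 10, 20, 40]
labels = ["none", "low", "medium", "high", "very high"]
freq = (
    pd.cut(route["canopy_height_m"], bins=bins, labels=labels, right=False)
    .value_counts(normalize=True)
    .sort_index()
)
print(freq)  # share of route cells in each canopy-height band
```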
Step 1 – Extract data
Draw a polygon using the selection tools in Studio
Step 2 – Determine the impact
Now that we have an extract, we can download it as a CSV and make some quick charts to see whether this route is more at risk of delays due to trees. For comparison, we will also extract the whole of SF.
So we have two CSVs:
- This route
- All of SF
To verify, let’s make a new map and add them.
Using some Python code, we can quickly determine that this route has a lower risk of cost impacts from tree canopy or roots.
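A minimal version of that comparison might look like the following. The inline DataFrames are stand-ins for the two exported CSVs, and the risk proxies (share of cells with any canopy, mean canopy height) are one reasonable choice, not the exact metrics used here:

```python
import pandas as pd

# Stand-ins for the two exported CSVs; in practice:
#   route = pd.read_csv("route.csv"); sf = pd.read_csv("sf.csv")
route = pd.DataFrame({"canopy_height_m": [0, 0, 2, 4, 6, 0, 3]})
sf = pd.DataFrame({"canopy_height_m": [0, 5, 12, 18, 7, 22, 9, 0]})

def canopy_risk(df: pd.DataFrame) -> dict:
    """Crude risk proxies: share of cells with any canopy, and mean height."""
    h = df["canopy_height_m"]
    return {"share_with_canopy": (h > 0).mean(), "mean_height_m": h.mean()}

route_risk, sf_risk = canopy_risk(route), canopy_risk(sf)
print("route:", route_risk)
print("SF:   ", sf_risk)
# A lower share and mean than the SF-wide baseline suggests a lower risk of
# tree-related construction delays along this route.
```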
You can run the code from here.
These findings are shown in the charts below.