In our previous blog, we discussed shapefiles and their significance in Geographic Information Systems (GIS). We also highlighted the basic spatial queries used to load shapefiles into PostgreSQL.
This next blog will focus on advanced spatial queries and how they can provide in-depth analysis for addressing various real-time development scenarios.
Scientific or spatial data can be used to support high-performance queries for geo-based insights and other geometrical operations in applications. Conducting high-level spatial queries on these data sets provides quick analysis to support multiple studies and diagnoses. It also helps with strategic planning, reporting and route navigation when combined with pgRouting or another similar database routing approach. An effective spatial query executed over a large volume of data can aid in addressing research challenges and provide insights for optimizing daily operations.
After importing shapefiles into a Postgres DB, a query can be executed by utilizing the spatial data stored in the database.
Steps to import Shapefiles into Postgres DB:
1. The below screenshot shows the opening of the PostGIS Shapefile and DBF load exporter:
- The next step is to enter the database credentials as shown in the screenshot below:
- Here, we open the shapefile:
- As shown in the below screenshot, the table properties are set:
- In this step, the import options are set:
- As shown in this next snippet, successful imports will create the table in the Postgres DB. This open-source database can be used to conduct the high-level queries needed to perform extensive analysis.
To demonstrate the utilization of advanced spatial queries, I am using an open-source GIS dataset for New York City, available on the City of New York website:
- NYC POI open data Link
- NYC Street open data Link
- NYC Parks open data Link
- NYC Zip code Boundaries Link
- NYC Boroughs Boundaries Link
Case 1: We are considering two shapefiles, one containing city and street information and another file containing geo data of places like shops, restaurants, hospitals and other services in the city. Let’s identify all services within a 20-meter range for a given street. The query shown below is then executed to obtain the listing:
Here, the ‘ST_Distancesphere postgis’ function is used to measure the minimum distance in meters between two longitude/latitude points.
Case 2: Let’s now use the New York City street line shapefile, and the New York City park shapefile, which contains geo information for all the parks in New York City. The below query will identify the number of streets intersecting each park and give the output of the highest number in descending order.
Here, we are using the ‘ST_Intersects’ function which returns true if the geometries spatially intersect each other, and returns false if they are disjointed.
Case 3: Here’s a query to identify all the parks within a given zip code boundary.
The ‘ST_Within’ function returns true if geometry A completely lies inside geometry B. Here, we need to ensure that both source geometries must be of the same coordinate projection, having the same Spatial System Reference Identifier (SRID).
Case 4: In this case, the query below is executed to obtain the distance (kilometers) between two given parks. New York City park shapefile data is used for this analysis.
Here, the simple ‘ST_Distance’ function is used, which returns the shortest distance that separates two geometries.
Case 5: New York City is divided into five boroughs. We are using two shapefiles, borough boundary information and street data to determine the sum of the streets’ length in kilometers within each borough. As can be seen in the next image, Queens Borough has the longest street infrastructure.
The above-mentioned use cases demonstrate how advanced spatial queries can be used to solve and analyze real-time problems. They also can be utilized to extract valuable information to be used in various analytical applications.
If your organization needs help analyzing geo-spatial data, contact us today.