Difference between revisions of "Spatial data management using PostGIS"
Alex Fortin (talk | contribs) m (→Managing Data) |
Alex Fortin (talk | contribs) m |
||
Line 27: | Line 27: | ||
'''Windows''' |
'''Windows''' |
||
− | *Installation of PostgreSQL, |
+ | *Installation of PostgreSQL, pgAdmin 4* & PostGIS |
*Uploading Data |
*Uploading Data |
||
*Retrieving Data |
*Retrieving Data |
||
Line 44: | Line 44: | ||
− | The Windows method is the most up to date (2024) using the latest versions (most stable versions) of PostgreSQL (v16.4-1), |
+ | The Windows method is the most up to date (2024) using the latest versions (most stable versions) of PostgreSQL (v16.4-1), pgAdmin 4 (v8), PostGIS (v3.4.2) & QGIS (v3.34.11). |
− | Ubuntu Linux which we access through OSGeoLive is unfortunately using older software without the possibility to bring up to date (Example: PostGIS 3.3.3, QGIS 3.28.5, |
+ | Ubuntu Linux which we access through OSGeoLive is unfortunately using older software without the possibility to bring up to date (Example: PostGIS 3.3.3, QGIS 3.28.5, pgAdmin 3). As such, a focus on the Windows method is suggested. |
Line 450: | Line 450: | ||
*In the terminal enter the following line; |
*In the terminal enter the following line; |
||
::<pre>sudo apt-get install postgresql postgresql-client postgresql-contrib pgadmin3 </pre> |
::<pre>sudo apt-get install postgresql postgresql-client postgresql-contrib pgadmin3 </pre> |
||
− | ::This line will get and install packages for postgresql, the postgresql-client, postgresql-contrib, and |
+ | ::This line will get and install packages for postgresql, the postgresql-client, postgresql-contrib, and pgAdmin 3. The postgresql package is the database itself, the postgresql-client is the client to communicate to the database from the shell(terminal), the postgresql-contrib is a library of additional tools for postgresql, and pgAdmin 3 is a gui interface that will easily allow user interaction with the database. |
Line 662: | Line 662: | ||
::In this example I will use SRID: 92149 which is NAD83 UTM Zone 18N |
::In this example I will use SRID: 92149 which is NAD83 UTM Zone 18N |
||
− | ::NOTE; There is a possibility that the SRID may not be in the Spatial_ref_sys table. If that is the case go to http://spatialreference.org/ and find the projection, and there should be an option labeled “PostGIS spatial_ref_sys INSERT statement” click on that link, copy the insert command, and execute the command in the SQL query browser in |
+ | ::NOTE; There is a possibility that the SRID may not be in the Spatial_ref_sys table. If that is the case go to http://spatialreference.org/ and find the projection, and there should be an option labeled “PostGIS spatial_ref_sys INSERT statement” click on that link, copy the insert command, and execute the command in the SQL query browser in pgAdmin 3. |
− | ::In our case it will be necessary to execute the following line to add the spatial reference, which will be done within |
+ | ::In our case it will be necessary to execute the following line to add the spatial reference, which will be done within pgAdmin 3 in the SQL query editor before adding the data. |
<pre> |
<pre> |
||
Line 720: | Line 720: | ||
− | NOTE Before Starting: Open Jump DataStore Query is finicky with names, ensure that all names are lower case for the schema, table, and column names. You may need to modify the table or schema name for the data you want. This is done within |
+ | NOTE Before Starting: Open Jump DataStore Query is finicky with names, ensure that all names are lower case for the schema, table, and column names. You may need to modify the table or schema name for the data you want. This is done within pgAdmin 3; |
:*Right-click on the schema or the table you want to modify. |
:*Right-click on the schema or the table you want to modify. |
||
Line 818: | Line 818: | ||
==== Advance Functionality performed Using Pgadmin3 ==== |
==== Advance Functionality performed Using Pgadmin3 ==== |
||
− | In |
+ | In pgAdmin 3 using the sql query builder, queries can ran on the data to get characteristics of the data without having the visualize the data. |
*In this first query, the area of a polygon will be returned. |
*In this first query, the area of a polygon will be returned. |
||
− | ::Enter the following command in the |
+ | ::Enter the following command in the pgAdmin 3 SQL query tool. |
::<pre>SELECT ST_AREA(the_geom) As "Area of Extent",* FROM watershed.extent;</pre> |
::<pre>SELECT ST_AREA(the_geom) As "Area of Extent",* FROM watershed.extent;</pre> |
||
Line 835: | Line 835: | ||
− | ::Enter the following command in the |
+ | ::Enter the following command in the pgAdmin 3 SQL query tool. |
::<pre>SELECT ST_Within(waterbodies.the_geom, extent.the_geom) As "Is the water boby within this extent" FROM watershed.extent, watershed.waterbodies;</pre> |
::<pre>SELECT ST_Within(waterbodies.the_geom, extent.the_geom) As "Is the water boby within this extent" FROM watershed.extent, watershed.waterbodies;</pre> |
||
Line 841: | Line 841: | ||
::This will create a list of all the waterbodies records and indicate whether true or false, the waterbody is within the extent. |
::This will create a list of all the waterbodies records and indicate whether true or false, the waterbody is within the extent. |
||
− | *In this last example of using |
+ | *In this last example of using pgAdmin 3, the distance between two features will measured using the ST_Distance function. |
− | ""Enter the following command in the |
+ | ""Enter the following command in the pgAdmin 3 SQL query tool. |
<pre> |
<pre> |
Revision as of 19:18, 4 October 2024
Contents
- 1 Introduction
- 2 About the Tutorial
- 3 Data
- 4 Windows
- 5 Ubuntu Linux
- 6 Conclusion
- 7 References
Introduction
In a postindustrial society, data is undeniably the most important asset that an organization possesses that allows it to conduct operations on a daily basis. Data is the foundation of many outcomes for tasks conducted in organizations that often enhance decision-making. The gravity of data is not solely confined to large organizations conducting business, data is also an important asset for research organizations, for the scholar, and for the home business entrepreneur, et cetera. In a postindustrial society which has put an emphasis on gathering and analyzing data, there has been over the years a multitude of innovations specializing in the discipline of data management.
In this tutorial, data management will be introduced for a specific data type known as spatial data. Spatial data, unlike regular data or other forms of data, has the purpose of representing the geographic location of features & boundaries using coordinates and topology.( Webopedia - Spatial Data )[1]. Furthermore, spatial data requires its own stream of technology for data management, which distinguishes it even more from other forms of data.
The intent of this tutorial will be to demonstrate the basic aspects of the discipline of data management specific to spatial data within a spatial database, followed by demonstrating what can be achieved using a spatial database in terms of advanced functionality. In addition to the primary intent, the entire tutorial will be conducted utilizing FOSS4G software and open source software; FOSS4G is an acronym for Free and Open Source Software for Geospatial. The benefit of giving a tutorial completely in open source software is that anyone can participate, because the software is free of charge. Here is a list of the open source software that will be utilized in this tutorial:
- Ubuntu
- PostgreSQL
- pgAdmin
- PostGIS
- Quantum GIS
- Open Jump
- QGIS
This tutorial's target audience is individuals with little to no knowledge of database management software, but with some experience in the field of Geomatics.
About the Tutorial
The tutorial will be divided into two sections. These sections will each have four main modules, and an optional module.
The two sections and their respective modules are as such;
Windows
- Installation of PostgreSQL, pgAdmin 4* & PostGIS
- Uploading Data
- Retrieving Data
- Advance Functionality & Data Processing
- PostGIS Raster Functionality (Optional)
Ubuntu Linux
- Installation of PostgreSQL & PostGIS
- Uploading Data
- Retrieving Data
- Advance Functionality & Data Processing
- PostGIS Raster Functionality (Optional)
The tutorial will first pass through the Windows method followed by Linux.
The Windows method is the most up to date (2024) using the latest versions (most stable versions) of PostgreSQL (v16.4-1), pgAdmin 4 (v8), PostGIS (v3.4.2) & QGIS (v3.34.11).
Ubuntu Linux which we access through OSGeoLive is unfortunately using older software without the possibility to bring up to date (Example: PostGIS 3.3.3, QGIS 3.28.5, pgAdmin 3). As such, a focus on the Windows method is suggested.
Within the Linux version commands will often be used for execution of procedures. If these commands and their functionalities are unclear, please refer to the external links section which will direct you to a command help page. You can also write man in the terminal followed by the command to get an instant help page.
Example of using man (manual):
man sudo
Data
The original data for this tutorial was acquire from geobase.ca for the Ottawa region. Unfortunately, this site is no longer accessible but its collections are available on open.canada.ca.
For this reason new up to data will be used in the windows method. For the Linux method similar data should be used but the original data will be shown throughout the tutorial.
To find data for the region you want you can look within this .kmz file here or within this index -> (Index). Once the region is selected you can download the shapefile and extract the files you want from here .
Windows Data
In this section, data was acquire from Open Canada for the Kapuskasing region. For simplicity, it is suggested to download identical or similar data from Open Canada.
This data can be found in the National Hydro Network and can be downloaded here (Kapuskasing Data only).
In the dataset that was downloaded, only three files were used;
- NHN_04LF000_2_0_HN_NLFLOW_1 which was renamed streams
- NHN_04LF000_2_0_HD_WATERBODY_2 which was renamed to waterbodies
- NHN_04LF000_2_1_WORKUNIT_LIMIT_2 which was renamed to extent
Raster Data was also used in the form of a Carleton University Orthophoto which can be found here.
- 2019 orthophotos - Ontario Ministry of Natural Resources (16cm) - Carleton University_1 which was renamed carleton
Linux Data
In this tutorial, all the data was acquired from geobase.ca for the Ottawa region. For simplicity, it is suggested to download identical or similar data from Geobase.
The data utilized in this tutorial, can be found at www.geobase.ca, under National Hydro Network by simply performing a search for "02LH000", and downloading the ESRI Shapefile. In order to download data from Geobase, it is necessary to have an account, but the accounts are freely available.
In the dataset that was downloaded, only three files were used;
- NHN_02LH000_1_2_HN_NLFLOW_1 which was renamed to streams
- NHN_02LH000_1_2_HD_WATERBODY_2 which was renamed to waterbodies
- NHN_02LH000_1_2_WORKUNIT_LIMIT_2 which was renamed to extent
Windows
Installation
In this section we will show you how to install PostgreSQL, pgAdmin 4, and PostGIS on Windows.
PostgreSQL
1) Download PostgreSQL Installer for Windows
- Go to the download page of PostgreSQL installers on the EnterpriseDB.
- Download the PostgreSQL for Windows x86-64 version 16.0 (or the latest version (Version 17 is not yet compatible with PostGIS)).
- Download the PostgreSQL for Windows x86-64 version 16.0 (or the latest version (Version 17 is not yet compatible with PostGIS)).
2) Install PostgreSQL on Windows step-by-step
- For this you need to have administrator privileges.
- Double-click the installer file and an installation wizard will appear and guide you through multiple steps. Hence, you can choose various options based on your preferences for PostgreSQL.
- Click the Next button.
- You should see the following
- Specify the installation directory by either choosing your preferred location or using the default folder suggested by the PostgreSQL installer and click the Next button
- Select software components to install:
- PostgreSQL Server option allows you to install the PostgreSQL database server.
- PostgreSQL Server option allows you to install the PostgreSQL database server.
- pgAdmin 4 option allows you to install the PostgreSQL database GUI management tool.
- pgAdmin 4 option allows you to install the PostgreSQL database GUI management tool.
- Stack Builder provides a GUI that allows you to download and install drivers that work with PostgreSQL.
- Stack Builder provides a GUI that allows you to download and install drivers that work with PostgreSQL.
- Command Line Tools option allows you to install command-line tools such as psql, pg_restore, and so on. These tools allow you to interact with the PostgreSQL database server using the command-line interface.
- Command Line Tools option allows you to install command-line tools such as psql, pg_restore, and so on. These tools allow you to interact with the PostgreSQL database server using the command-line interface.
- In this tutorial we will be using Stack Builder to install PostGIS. As such, make sure it is installed. At this stage you can also install pgAdmin 4 which will be used to manage the database.
- Choose the database directory to store the data, or accept the default directory.
- Click the Next button to proceed to the next step:
- Enter the password for the database superuser (postgres)
- After entering the password, retype for confirmation, and then click the Next button:
- Specify a port number on which the PostgreSQL database server will listen. The default port of PostgreSQL is 5432. Ensure that no other applications are using this port.
- Select the default locale for the PostgreSQL server. If you leave it as the default, PostgreSQL will use the operating system locale.
- Afterward, click the Next button.
- The setup wizard will show the summary PostgreSQL information. Review the details, and if everything is correct, click the Next button. Otherwise, click the Back button to adjust the configuration accordingly.
- Now, you are ready to install PostgreSQL on your computer.
- Click the Next button to initiate PostgreSQL installation.
- The installation may take a few minutes to complete.
- Once completed click the Finish button to complete the PostgreSQL installation.
PgAdmin *Optional
pgAdmin4 should have been installed when installing PostgreSQL. If you unchecked the box this is how you can install it separately.
pgAdmin is the leading open source management tool for PostgreSQL, the world’s most advanced open source database. pgAdmin 4 is designed to meet the needs of both novice and experienced Postgres users alike, providing a powerful graphical interface that simplifies the creation, maintenance, and use of database objects.(1)
PgAdmin is available for multiple platforms (PgAdmin Download Page)
- In this tutorial we will use it for Windows.
1) Download PgAdmin
- Go to the Windows download page. (Windows Download Page)
- Download pgAdmin 4 v8.12 (or the latest version).[PgAdmin4 v8.12-x64 Direct Download]
- Download pgAdmin 4 v8.12 (or the latest version).[PgAdmin4 v8.12-x64 Direct Download]
2) Install PgAdmin 4 on Windows step-by-step
- For this you need to have administrator privileges.
- Double-click the installer file and an installation wizard will appear and guide you through multiple steps.
- Click the Next button.
- You should see the following
- Read Through the license agreement and select "I accept the agreement" before clicking the Next button.
- Specify the installation directory by either choosing your preferred location or using the default folder suggested by the PgAdmin 4installer and click the Next button.
- In our case we will continue using the default folder.
- Select if you would like the program to create a shortcut within the start menu folder.
- In our case we do want to create a Start Menu Folder.
- The setup wizard will show the summary PgAdmin 4 information. Review the details, and if everything is correct, click the Next button to Install. Otherwise, click the Back button to adjust the configuration accordingly.
- The installation may take a few minutes to complete. Once completed click the Finish button to complete the pgAdmin 4 installation.
PostGIS
PostGIS is a spatial database extender for the open source PostgreSQL database.( Obe, Regina O. & Leo S. Hsu. 2010. PostGIS in Action. Manning Publishing.)[2]. This extender is what will grant the capability to manage spatial data within the PostgreSQL database.
To install PostGIS on Windows we will use the Stack Builder Application which was installed during the PostgreSQL segment.
- Open Stack Builder and an installation wizard will appear and guide you through multiple steps. Hence, you can choose various options within Stack Builder based on your preferences (It is suggested to follow the same steps shown bellow to not run into any issues).
- Begin by selecting PostgreSQL 16(Or the version you want) from the drop down menu, followed by clicking the Next button.
- This next page will display a variety of applications of which you can install. In our case go to the Spatial Extensions drop down menu and select PostGIS 3.4 Bundle for PostgreSQL 16. Then Click Next.
- A page displaying the selected packages will appear. Make sure PostGIS is selected and chose the location in which you want it to be downloaded. In our case the downloads folder was chosen.
- Clicking the Next button should prompt the installation of said packages (Should only take a few seconds).
- A page displaying that the files have been successfully downloaded should appear. Click Next.
- A new page should have appeared. This is the PostGIS Bundle for PostgreSQL setup page. First read through the license agreement and select I Agree.
- In this page we will select Create spatial database. Then click Next. This will setup our spatial database for later.
- You can now choose in which location you would like it to be installed. In our case the destination will be set to the PostgreSQL 16 program file as seen bellow.
- A page to connect your database will now appear. On this page you will insert your username and password (This is the password you chose when setting up PostgreSQL earlier).
- You can now enter your database name. In this case we will keep what they have put. Clicking Next will begin the installation process.
- Once the install is complete you should see the page below. You can click Finish to complete the process.
Now that the PostgreSQL database has been installed and extended with the PostGIS extension. It is now possible to begin using PostGIS to manage data.
Uploading Data
Create a Spatial Database
The first part of storing data is to create a database to house it. In this section we will create a database through pgAdmin 4 which we installed previously.
- To begin we will open the application by selecting start and searching for pgAdmin 4. It should appear as seen bellow. Click on it to open the application.
- After opening the application press on the server drop down and select the PostgreSQL 16 server.
- This will prompt you to enter a password. This is the same password that we created during the PostgreSQL installation process. (You should save password it. This will prevent you from needing to re-enter it.)
- Once connected to the server press the Databases drop down menu. In this menu we can observer our sample database from the PostGIS installation is visible. This shows us that PostGIS was installed and is working. Pressing on it will connect the database. (We will not use this database. In the next step we will create our own.)
- To create our own database we will right click on Databases and select Create Database.
- Next we will name out database. I named it env_can_db but you can name is whatever you like. After entering the name press save.
This will conclude the creation of the Database.
Managing Data
Before uploading data it is useful to create schemas to logically store data. Once again using pgAdmin 4 we will create a connection to the database created. This will allow us to access the database from within various applications later on.
- To do this right click on Login/Group Roles and press Create.
- Just like the database you can name this user anything you would like. I named it main_user
- Then go to the Defenition page and add a password. I used the same password as we've been using.
- Following this go to the Priveleges page and make this user a "Superuser" and "Can login". Once this is complete press Save.
- Now that our login role/user is created we must grant it privileges onto the database. To do this right click our database (env_can_db) and select Query tool.
- Once on the Query page we will enter a query to grant the privileges needed. As well as two other queries to add spatial extensions to our database making it a Spatial Database. Following each query press execute script' as shown in the following image.
- This first query grants the privileges needed to our user:
grant all privileges on database env_can_db to main_user;
- After running the first query and confiming it worked. Remove it and enter the following query. This adds the PostGIS extension to the database:
create extension postgis;
- This final query adds the PostGIS_topology extension:
create extension postgis_topology;
Concluding these steps our database is now capable of storing spatial data.
Uploading Data in QGIS
The following procedure will be to upload data. There are multiple ways to go about uploading data into PostgreSQL/PostGIS.
- QGIS
- shp2pgsql
- shp2pgsql-gui
- & more...
In this tutorial we will demonstrate how to upload the data using QGIS. (For this portion it is assumed the user has already downloaded QGIS. If this is not the case a step-by-step video guide can be found here)
Once QGIS is installed and open, Start a new project and load some vector layers.
The layers that I've loaded are as such and have been renamed (Can be found in the Data section):
- NHN_04LF000_2_0_HN_NLFLOW_1 which was renamed streams
- NHN_04LF000_2_0_HD_WATERBODY_2 which was renamed to waterbodies
- NHN_04LF000_2_1_WORKUNIT_LIMIT_2 which was renamed to extent
At this point it would be important to define a spatial reference for the data being uploaded.
In this example I will use WGS84/UTM zone 17N (EPSG:32617).
We will now connect our PostgreSQL Database to QGIS. To do this right click PostgreSQL in the Browser tab and select New Connection.
When creating a connection with PostGIS there are a few steps you need to take to make sure it funcitons properly
- In the name field, enter a name for the connections
- In the host enter localhost
- In the port make sure the port matches your database (5433)
- In the database field enter the name of the database
- Under the Basic Tab enter the name of the username
- In the password, enter the password for that username
- Click on test connection. If everything is good, proceed by clicking on ok.
Once this link is made you can now double click it and enter the username and password again to connect.
After connecting we will go to the DB Manager which is found under the Database tab.
Here we will be able to use the Import Layer/File feature to import our layers to our Database. We will import them to the public schema but you can create a new schema in pgAdmin to store them in if you prefer.
- Start by selecting your input.
- Select the same options as seen below and make sure the SRID's are correct.
- Then proceed by clicking on ok.
Once this is completed you can repeat the process for other Vector Layers. Storing them within you Spatial Database!
Retrieving Spatial Data
The data is now located in your spatial database, and is ready to be accessed and retrieved. in this section we will retrieve the data using the GUI in QGIS.
Retrieving Spatial Data in QGIS
Retrieving Data in QGIS from PostgreSQL/PostGIS is very simple.
- In the Browser menu got to PostgreSQL and access your database. Once in your database locate the Schema in which your data is located and expand it. Here you will see the data you have uploaded, as seen below.
- You right click the data and select add layer to project to add it to your project.
- This can also be done by simply dragging the data onto your project or in the Layers menu.
- After this the data should be displayed.
Advance Functionality in PostGIS
Advance Functionality performed using pgAdmin4
In pgAdmin using the SQL Query Tool, queries can be ran on the data to get characteristics of the data without having the visualize the data.
- In this first query, the area of a polygon will be returned.
- Enter the following command in the pgAdmin 4 SQL Query Tool.
SELECT ST_AREA(geom) As "Area of Extent",* FROM watershed.extent;
- This query returns the area for the polygon as seen in the screen capture below.
- The next query will return the boolean true or false, if a shape is within another shape. In this example I've queried if the waterbodies are all within the extent polygon.
- Enter the following command in the pgAdmin 4 SQL query tool.
SELECT ST_WITHIN(waterbodies.geom, extent.geom) As "Is the water body within this extent" FROM watershed.extent, watershed.waterbodies;
- This will create a list of all the waterbodies records and indicate whether true or false, the waterbody is within the extent.
- In this last example of using pgAdmin 4, the distance between two features will measured using the ST_Distance function.
- Enter the following command in the pgAdmin 4 SQL query tool.
SELECT ST_DISTANCE(waterbodies.geom,streams.geom) AS Distance, * FROM watershed.waterbodies, watershed.streams
- WHERE waterbodies.id = 12 AND streams.id = 35;
- This will generate a column labeled distance that will contain the distance between both features specified in the where clause.
PostGIS Raster
For this section the data used is an orthophoto of Carleton U which was downloaded here: Orthophotos
Installation
To install the PostGIS Raster extension go to the Query Tool page of pgAdmin 4 and enter the following:
create extension postgis_raster;
Uploading Raster Files
Preperation
Before Uploading the raster files we must first create a new environment variable.
- In the start menu type edit the system environment variables and go to the application.
- Select Environment Variables....
- Go to the System variables menu and find Path and select edit.
- in the Edit environment variable page select new and paste > C:\Program Files\PostgreSQL\16\bin
- Then press OK until all screens are closed.
It is also important to gather other information.
- Select the raster file(s) you wish to load.
- Identify the SRID ("projection") of your data. You can find this under the information tab of the layer properties in QGIS. (Ex: EPSG:5433)
- Either identify the target database where you would like to load the data or create a new database.
raster2pgsql
To upload the raster file(s) we will use raster2pgsql which is a raster loader executable that loads GDAL supported raster formats into SQL suitable for loading into a PostGIS raster table.
- Open Command Prompt
- Type raster2pgsql. This will display all the options available when trying to upload a raster to PostGIS.
- In our case follow the example bellow:
C:\Users\User\Desktop> raster2pgsql -s <SRID/ESPG> -I -M -C <Name of Raster> <Name of Schema.Table name(Can be what you want)> | psql -d <Database Name> -h localhost -U <psql User> -p <Port>
This is completed example:
C:\Users\alex4\OneDrive - Carleton University\Desktop> raster2pgsql -s 32618 -I -M -C Carleton.tif topology.carleton | psql -d env_can_db -h localhost -U postgres -p 5433
You should now be able to find you raster under the Tables tab in you database.
Retrieving Raster Image
Retrieving a Raster can be done exactly as shown in retrieving a Vector layer in QGIS.
- In the Browser menu got toe PostgreSQL and access your database. Once in your database locate the Schema in which your data is located and expand it. Here you will see the data you have uploaded, as seen below.
- You right click the data and select add layer to project to add it to your project.
- This can also be done by simply dragging the data onto your project or in the Layers menu.
- After this the data should be displayed.
Ubuntu Linux
Installation
In this section we will show you how to install PostgreSQL and PostGIS on Linux (Ubuntu).
PostgreSQL
The first part of the installation for developing a spatial database, is the installation of PostgreSQL. This will be performed using what is known as the synaptic package manager, but via the terminal. (command line)
- In the Applications menu (top-left), in accessories click terminal. A terminal will open.
- In the terminal enter the following line;
sudo apt-get install postgresql postgresql-client postgresql-contrib pgadmin3
- This line will get and install packages for postgresql, the postgresql-client, postgresql-contrib, and pgAdmin 3. The postgresql package is the database itself, the postgresql-client is the client to communicate to the database from the shell(terminal), the postgresql-contrib is a library of additional tools for postgresql, and pgAdmin 3 is a gui interface that will easily allow user interaction with the database.
- Following the execution of that command, PostgreSQL will be installed on your computer.
- A quick test to ensure the database works, enter the following command in the terminal to temporarily "turn you into" the newly installed postgres system user, so you have control over the overall database:
sudo su postgres
- you should see the following;
- The following procedure will create a user for the PostGreSQL database.
- Enter the following command within the terminal, to create a new database user ( PostgreSQL - Documentation)[2].
createuser -sP [enter username]
- This command will create a new user within the database environment; the -s & P before the name of the new user are parameters that will make the user a superuser and prompt the user for a password to ensure a password is set.
- Enter the following in the postgresql-client to exit the PostgreSQL user's shell prompt, returning you to your own user account.
exit
PostGIS
PostGIS is a spatial database extender for the open source PostgreSQL database.( Obe, Regina O. & Leo S. Hsu. 2010. PostGIS in Action. Manning Publishing.)[3]. This extender is what will grant the capability to manage spatial data within the PostgreSQL database.
- Now its time to install the postGIS extension to PostgreSQL, which is performed using the following line in the terminal;
sudo apt-get install postgresql-8.4-postgis
There should be a similar output to what was seen when postgresql was installed.
- Now we will create a postgis spatial database template, which will be used as a model to create spatial databases. ( GeoLabs)[4]. The template is a quick method of creating a spatial database without having to go through the entire process each time, as seen in the creation of the spatial template.
- If you are not already in the postgres user shell, enter the following
sudo su postgres
- The following command will create a database using the UTF8 character encoding, and built modelled after the default PostgreSQL template that will be utilized as the spatial database template.
- The options are;
- -E specify the character encoding
- -T Use the postgresql template to build the database upon
createdb -E UTF8 -T template0 template_postgis
- The following command will define a new procedural language, which will basically add the plpgsql programming language to the template. ( PostgreSQL 8.4 Documentation)[5].
- -d specifies the db name
createlang -d template_postgis plpgsql
- This line of code will add the SQL procedural language to template_postgis, which will enable the ability to write postgis functions/plpgsql scripts further on. It basically extends the capabilities of the spatial template.
- Now the postGIS SQL routines must be extended to the postgis template that we are creating.
- There are two files that must be added; postgis.sql and spatial_ref_sys.sql.
- The location for both files should be as follows (Note: Ensure the correct version of both postgresql and postgis is used in the file path);
- /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql
- /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql
- Execute the following two commands
psql -d template_postgis -f /usr/share/postgresql/8.4/contrib/postgis-1.5/postgis.sql
psql -d template_postgis -f /usr/share/postgresql/8.4/contrib/postgis-1.5/spatial_ref_sys.sql
- Now we want to allow users to alter the spatial tables. This procedure may not be necessary if the user is only setting-up a postgis DB at home, but it is useful in a professional environment with multiple users.
- Execute the following commands;
psql -d template_postgis -c "GRANT ALL ON geometry_columns TO PUBLIC;"
psql -d template_postgis -c "GRANT ALL ON geography_columns TO PUBLIC;"
psql -d template_postgis -c "GRANT ALL ON spatial_ref_sys TO PUBLIC;"
- Each command will modify the permissions for the geometry_columns, geography_columns, and the spatial_ref_sys table.
- The following command is a simple procedure that will basically clear all the storage sectors in the template to ensure there is no garbage. This basically ensures that the template is completely empty.
psql -d template_postgis -c "VACUUM FULL;"
- Now its time to test the spatially enabled database.
- Execute the following command;
psql -d template_postgis -c "SELECT postgis_full_version();"
- The following should appear on screen, press 'q' to return.
- NOTE: The version of PostGIS indicated in the screen capture probably varies from your screen. I've installed the unstable PostGIS 2.0 package.
- The following two commands should be executed to ensure that the previously created template is marked as a template.
psql -d postgres -c "UPDATE pg_database SET datistemplate='true' WHERE datname='template_postgis';"
- The datistemplate will specify that the db is to used as a template, and can be cloned by users with createdb privileges.
psql -d postgres -c "UPDATE pg_database SET datallowconn='false' WHERE datname='template_postgis';"
- The datallowconn if set to false, will refuse connections made directly to the database. This is a good technique to ensure your template doesn't get populate with unwanted data accidentally. (Notice that the PostGIS testing command above will not work if the datallowconn is set to false)
This Completes the Installation portion of the tutorial.
Uploading Data
Now that the PostgreSQL database has been installed and extended with the PostGIS extension. It is now possible to begin using PostGIS to manage data.
Create a Spatial Database
- The first procedure to storing data is to create a database to house the data. In this step we will create a database using the template created in the previous section.
- Execute the following command to create a spatial database using the spatial template;
createdb -T template_postgis ENV_CAN_DB
- ENV_CAN_DB, is the name of the database, enter whatever you would like as a DB name.
- To see the new database that you've created enter the following:
psql -l
- press the 'q' key once ready to return. Enter the exit command to return the shell prompt back to your normal user account.
- You can also run the test query previously executed on the template_postgis to test the new database.
Managing Data
Before uploading data it would be valuable to create a series of schemas to logical store your data.( Obe, Regina O. & Leo S. Hsu. 2010. PostGIS in Action. Manning Publishing.)[6]. But first it's time to open a GUI interface to communicate with PostgreSQL, which may facilitate further procedures, and help conceptualize the database that up until this moment has only been encountered via command line.
During the initial installation of PosgreSQL, another application was installed called pgAdmin 3. PgAdmi n3 is an application that allows the user to perform database duties via a GUI interface.
- In the command line enter the following (ensure you typed exit to quit the postgresql user shell in the previous section)
pgAdmin 3
- or, using the mouse:
Go to Applications -> Programming -> pgAdmin 3
- Now that the interface is open, you will need to create a connection to your postgresql server.
- Click on the icon that looks like an electrical socket and you will get the following prompt.
- In the first field you will need to specify a name for the server, simply enter Postgresql or whatever you would like.
- In the host field, enter localhost.
- In the post field field you should have '5432' by default. If there are issues with this port number you can open /etc/postgresql/8.4/main/postgresql.conf using the following line
sudo gedit /etc/postgresql/8.4/main/postgresql.conf.
Once the file opens you will see a variable called port which will be assigned the default port number. That is the port number required to connect to the postgresql server.
- In the username field enter the name of the user you created.
- In the password field, enter the password for the username above.
- Press ok
- Now you should have a postgres icon in the dropdown below server on the left-hand side. Click on this to expand the postgres server. Now click on databases, and select the database you created using the template_postgis.
- Before continuing towards uploading data to your spatial database, schemas will be created to logically store our data. It is important to create schemas and not store data in the public schema.
- In pgAdmin 3, right-click on schema and click on “new schema”. The following popup will appear.
- In the new schema form;
- Enter a name for the schema, which is representative of the data that will be stored in that schema.
- Indicate a owner if desired
- press ok.
- In the new schema form;
- The new schema is now created.
Uploading Data in QGIS
The following procedure will be to upload data. There are many different ways to go about uploading data into postgresql/postGIS.
- QGIS
- shp2pgsql
- shp2pgsql-gui
- and more...
In this tutorial we will demonstrate uploading data with QGIS. If QGIS is not installed enter following line;
sudo apt-get install qgis
Then proceed to
applications -> Science -> QGIS
- Once QGIS is open, load some vector layers. After adding vector files;
- click on the plugins menu, and then manage plugins.
- In the list of plugins, ensure SPIT is selected which will enable the import of vector files from QGIS to postgresql/postGIS.
- (In this example I've added a portion of the National Hydro Network for tile 02LH000. (Geobase Data))
- Now click on the elephant icon, which is the SPIT tool, and the following prompt will open.
- Click on 'New' to open a new prompt to connect to the database.
- In the name field, enter a name for the connections
- In the host enter localhost
- In the database field enter the name of the database
- In the username, enter the name of the username
- In the password, enter the password for that username
- Click on test connection. If everything is good, proceed by clicking on ok.
- Once back at the main SPIT interface, in the dropdown select the database connection you want.
- Uncheck the SRID checkbox, it is always important to define a spatial reference ID for the data being uploaded, but for now you may want to leave it at -1 for ease of use.
- In this example I will use SRID: 92149 which is NAD83 UTM Zone 18N
- NOTE; There is a possibility that the SRID may not be in the Spatial_ref_sys table. If that is the case go to http://spatialreference.org/ and find the projection, and there should be an option labeled “PostGIS spatial_ref_sys INSERT statement” click on that link, copy the insert command, and execute the command in the SQL query browser in pgAdmin 3.
- In our case it will be necessary to execute the following line to add the spatial reference, which will be done within pgAdmin 3 in the SQL query editor before adding the data.
INSERT into spatial_ref_sys (srid, auth_name, auth_srid, proj4text, srtext) values ( 92149, 'epsg', 2149, '+proj=utm +zone=18 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs ', 'PROJCS["NAD83(CSRS98) / UTM zone 18N (deprecated)",GEOGCS["NAD83(CSRS98)",DATUM["NAD83_Canadian_Spatial_Reference_System",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6140"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9108"]], AUTHORITY["EPSG","4140"]],UNIT["metre",1,AUTHORITY["EPSG","9001"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996], PARAMETER["false_easting",500000],PARAMETER["false_northing",0],AUTHORITY["EPSG","2149"],AXIS["Easting",EAST],AXIS["Northing",NORTH]]');
Once everything is completed, the spatial data is now stored within your Spatial Database!
Retrieving Spatial Data
The data is now sitting in your first spatial database, and is ready to be accessed and retrieved. This portion of the tutorial will demonstrate retrieving data using both the GUI approach in QGIS and the SQL approach in Open Jump.
Retrieving Spatial Data in Qgis
To retrieve data in QGIS is relatively easy. In QGIS look for the blue icon with a green plus sign that will have the caption “add postgis layer”.
- Click on the “add postgis layer” icon.
- Select the database from the list within the dropdown menu. If the database is missing, repeat the same procedure to add a database connection that was performed in the SPIT procedure.
- Click on Connect
- Select the schema, which will expand and list all the data you've uploaded.
- Select the data you want to retrieve, and press add.
Immediately following the last procedure, the spatial data will now be displayed.
Retrieving Spatial Data in Open Jump
In this portion of the retrieving data portion of the tutorial, Open Jump will be utilized. Using Open Jump will allow the execution of SQL queries to retrieve data. This will grant another perspective for retrieving data.
In the terminal enter the following command to open Open Jump.
Openjump
Depending on your version of Open Jump, you may need to download the latest version required for this tutorial. (Open Jump 1.4.0.2) If that is the case go to http://sourceforge.net/projects/jump-pilot/files/OpenJUMP/1.4.0.2/ to download the linux version which is in a .zip archive. (Likely the case with Ubuntu 10.04)
- Extract the archive
- In the terminal go to the folder /OpenJUMP/bin
- Execute the following command
sudo chmod +x oj_linux.sh
- This will make the file executable, and then enter the following command
sudo ./oj_linux.sh
- Open Jump 1.4.0.2 will open, which has the datastore query tool required for this tutorial.
NOTE Before Starting: Open Jump DataStore Query is finicky with names, ensure that all names are lower case for the schema, table, and column names. You may need to modify the table or schema name for the data you want. This is done within pgAdmin 3;
- Right-click on the schema or the table you want to modify.
- Click on Properties.
- In the name field, reenter the item's name in lower case.
- Click Ok.
Now that Open Jump is open;
- Click on Layer -> Run Datastore Query
- It will be necessary once again to establish a connection to the database.
- Click on database icon, click on add, and complete the required information as in the previous steps for QGIS.
- Once the connection has been established, enter the following command, but with the names of your database, and your data.
SELECT ST_AsBinary(the_geom), * FROM YourSchemaName.YourTableName;
- Here is what I've entered to get the following screen capture.
SELECT ST_AsBinary(the_geom) FROM watershed.extent; SELECT ST_AsBinary(the_geom) FROM watershed.waterbodies; SELECT ST_AsBinary(the_geom) FROM watershed.streams;
- The first portion of the query is the “SELECT” statement, which determines what data columns you want returned with the execution of the query. Using PostGIS there is one important column that must be treated differently, and that's the the_geom column.
- The the_geom column is the column in each spatially enabled table, that stores geographic data for the feature. When calling the column in a query it is necessary to convert the PostGIS geometry to OGC standard binary format for Open Jump.( Obe, Regina O. & Leo S. Hsu. 2010. PostGIS in Action. Manning Publishing.)[7]. Different applications will require different conversions.
- Following the the_geom column conversion, a comma is placed and an * is used to quickly imply that all columns will be returned with the query. This basically means that all columns that you would have in an attribute table will be returned.
- The next portion of the query is the FROM statement which is used to specify the schema and the table where the data can be found.
- The last portion of the SQL query is always the semi-colon, which must be used or an error will be generated.
Advanced Functionality in PostGIS
As discussed earlier in the introduction, data can be processed to derive information to eventually derive knowledge. This need to generate information from data is crucial in Spatial Analysis, and in decision-making. When a spatial database is instantiated and utilized to house all the data for an organization, this database is an incredibly powerful tool that can be used to generate information.
In PostGIS there is a massive collection of various functions that can be called to manipulate, generate and process spatial data. All these functions are listed in the PostGIS documentation found at http://postgis.refractions.net/documentation/manual-1.5/
In this portion of the tutorial, basic advance functionality will be explained to demonstrate the abilities of PostGIS.
Advance Functionality Using Open Jump
- The first query that will be demonstrated is a query similar to the data retrieval query but with a where clause. The where clause is to used specify what data to return.
- In the Open Jump Datastore Query tool, enter the following command;
SELECT ST_AsBinary(the_geom),* FROM watershed.streams where gid >= 1 AND gid <= 500;
- NOTE: Once again, column names that are in uppercase will cause an error in OpenJump.
- In this query, all records from the streams table were returned if the gid column value was between 1 and 500.
- The second query that will be ran is a buffer query.This query is more intresting because it runs a buffer process without having to perform such an operation using a dekstop GIS application.
- To execute a buffer use the following syntax modified for your schema and table names.
SELECT ST_AsBinary(ST_BUFFER(the_geom,0.005)),* FROM watershed.waterbodies where gid = 1045;
- This query will create a buffer region around the waterbody with the gid value 1045.
- The following query will demonstrate how to perform a union between two polygon files. In this query I've specified a range for waterbodies table, because this function is memory consuming and will crash Open Jump.
- In the Open Jump Datastore Query tool, enter the following command;
select st_asbinary(ST_Union(waterbodies.the_geom,extent.the_geom)),* FROM watershed.waterbodies, watershed.extent where waterbodies.gid >= 1 AND waterbodies.gid <= 10;
- If you look in the attribute table of the newly processed polygon, you will see that the union between the extent and the waterbodies has merged the records together.
- This last query in OpenJump will demonstrate how to have a line drawn between two features that is the shortest path.
- In the Open Jump Datastore Query tool, enter the following command;
select st_asbinary(st_shortestline(waterbodies.the_geom,streams.the_geom)), * From watershed.waterbodies, watershed.streams Where waterbodies.gid = 1273 AND streams.gid = 10735;
- As you can see in the image above, a line has been generated displaying the path of the shortest distance between both features.
Advance Functionality performed Using Pgadmin3
In pgAdmin 3 using the sql query builder, queries can ran on the data to get characteristics of the data without having the visualize the data.
- In this first query, the area of a polygon will be returned.
- Enter the following command in the pgAdmin 3 SQL query tool.
SELECT ST_AREA(the_geom) As "Area of Extent",* FROM watershed.extent;
- This query returns the area for the polygon as seen in the screen capture below.
- The next query will return the boolean true or false, if a shape is within another shape. In this example I've queried if the waterbodies are all within the extent polygon.
- Enter the following command in the pgAdmin 3 SQL query tool.
SELECT ST_Within(waterbodies.the_geom, extent.the_geom) As "Is the water boby within this extent" FROM watershed.extent, watershed.waterbodies;
- This will create a list of all the waterbodies records and indicate whether true or false, the waterbody is within the extent.
- In this last example of using pgAdmin 3, the distance between two features will measured using the ST_Distance function.
""Enter the following command in the pgAdmin 3 SQL query tool.
select st_distance(waterbodies.the_geom,streams.the_geom) As Distance, * From watershed.waterbodies, watershed.streams Where waterbodies.gid = 1273 AND streams.gid = 10735;
- This will generate a column labeled distance that will contain the distance between both features specified in the where clause.
PostGIS Raster
The following section is for the adventurous who want to begin testing the PostGIS Raster functionality that is currently in development. The stable version of PostGIS Raster is anticipated to be released in the summer of 2011, in PostGIS 2.0. The following section will require that the unstable version of PostGIS 2.0 be installed from source.
Installation
- Download the latest PostGIS 2.0 snapshot from http://postgis.refractions.net/download/
- Within a terminal go to the folder location of the downloaded package
- In the terminal enter the following; ( PostGIS 2.0 Documentation)[8].
tar xvfz postgis-2.0.0SVN.tar.gz cd postgis-2.0.0SVN ./configure –with-raster make make install
- you will need to create a new spatial database template, with the following lines of code in the postgresql-client. (sudo su postgres)
createdb yourdatabase createlang plpgsql yourdatabase psql -d yourdatabase -f postgis.sql psql -d yourdatabase -f spatial_ref_sys.sql psql -d yourdatabase -f rtpostgis.sql
- The entire process is similar to how the spatial database template was created in the first portion of the tutorial, the only differences are the paths for the files, and the additional “psql -d yourdatabase -f rtpostgis.sql” to extend the raster functionality in the spatial template.
- Once this is complete test the PostGIS spatial template with;
psql -d new_template_postgis -c "SELECT postgis_full_version();"
Upload Raster File
- Once the new template is good to go, create a new database, a new schema, and upload your raster file using the raster2pgsql.py script, using the following line of code in the terminal, not postgres-client.
- Format;
Python /path/of/raster2pgsql/script -r /path/of/raster/image/image.tif -t schema.tablename -l 1 -k 200x200(size of tiles) -o sql_script_generated.sql -s SRID -I -M
- This is the script that I've utilized to generate a script that will upload the raster file.
python /home/jsbouffa/Downloads/postgis-2.0.0SVN/raster/scripts/python/raster2pgsql.py -r /home/jsbouffa/Downloads/aerial/S4_07550_4536_20060830_m20_1_utm18.tif -t raster.rastertiled -l 1 -k 200x200 -o upload_dem.sql -s 92149 -I -M
- Now open a postgres-client session in terminal and enter the following line;
- Format;
psql -d databaseName -f nameOfGeneratedSQLFile.sql
- This is what I executed;
psql -d ENV_CAN_DB -f upload_dem.sql
- Now the script has been uploaded with the database.
Retrieve Raster Image
Retrieving the data is not as many might think within Open Jump. Open Jump will returned a polygon representing the raster with values. To my knowledge the only method possible for displaying the actual raster as an image format will need to utilize the GDAL PostGIS raster tools, which are currently available in Mapserver, but not in Open Jump. ( PostGIS Raster Frequently Asked Questions)[9].
The following command will only display the extent of the raster;
select ST_AsBinary(rast::geometry), * FROM raster.rastertiled;
The rast::geometry type casts the raster into a polygon to display the extent.
This concludes the portion for PostGIS Raster.
Conclusion
In this tutorial the value of data management was not only demonstrated for data storage purposes to adequately manage data, but it also demonstrated the additional capabilities within PostGIS that uses the stored data to produce valuable information. In this tutorial the reader should now have a basic understanding of the following;
- Installing PostgreSQL / PostGIS
- Managing & Uploading Data
- Retrieving Data
- Advance Functionality in PostGIS
- and Basic PostGIS Raster Capabilities
Contributions to This Tutorial
This tutorial will hopefully evolve as PostGIS and other software utilized in this tutorial evolve. I would like to promote any additions to this tutorial that may contribute to enhancing the quality of this tutorial. The following is a list of areas that I would specifically encourage contributions;
- PostGIS Raster & Mapserver
- Advanced Functionality in PostGIS
- Bringing up to date the Linux portion of the tutorial by using up to date data and software(if it is possible)
About This Tutorial
This tutorial was created in 2010 for GEOM4008 which is part of the Geomatics program at Carleton University, located in Ottawa, Ontario, Canada.
It was then updated in 2024 to display the functions of Spatial Data Management using PostGIS on WindowsOS.
References
- Spatial Data in Webopedia. 2010. http://www.webopedia.com/TERM/S/spatial_data.html
- PostgreSQL 8.4 Documentation.2009. http://www.postgresql.org/docs/8.4/static/
- Obe, Regina O. & Leo S. Hsu. 2010. PostGIS in Action. Np:Manning Publishing.
- PostGIS Template Database in GeoLabs. 2006. http://geospatial.nomad-labs.com/2006/12/24/postgis-template-database/
- PostGIS Raster Frequently Asked Questions. http://postgis.refractions.net/documentation/manual-svn/RT_FAQ.html
Here is a list of links that are either additional support or other tutorials of interest;
- PostgreSQL. 2024. https://www.postgresql.org/
- pgAdmin. 2024. https://www.pgadmin.org/
- PostGIS. 2024. https://postgis.net/
- QGIS. 2024. https://www.qgis.org/
- Bash Linux Shell Commands - http://ss64.com/bash/
- PostGIS 1.5.2 Manual - http://postgis.refractions.net/documentation/manual-1.5/
- PostGIS WKT Raster Tutorial 1 - http://trac.osgeo.org/postgis/wiki/WKTRasterTutorial01
- Spatial Reference - http://spatialreference.org
- Open Jump - http://www.openjump.org/
- Quantum GIS - http://www.qgis.org/
- GDAL - http://www.gdal.org/
- Ubuntu - http://www.ubuntu.com/
- OSGeo - http://www.osgeo.org/
Data
- National Hydro Network. Open Canada. https://ftp.maps.canada.ca/pub/nrcan_rncan/vector/geobase_nhn_rhn/shp_en/04/
- Carleton Orthophotos. Carleton Library. https://library.carleton.ca/find/gis/geospatial-data/carleton-campus-aerial-images