Unsupervised Classification and Polygonization With QGis

From CUOSGwiki
Jump to navigationJump to search

If you have any trouble with any part of this tutorial please view the video walkthrough.

Video Walkthrough

This document is a companion to the video walkthrough available on YouTube!


In some places, extra details that were not covered in the video are added here.


There are lots of reasons for wanting to make a land cover classification. Whether that is to make beautiful maps, a temporal study of land cover or a more complicated statistical analysis. The goal of this tutorial is to be a jumping off point into the profound and complicated world of classifications.

In this tutorial we will be looking at how to perform an unsupervised classification; that is, a classification where the computer program chooses the classes without a training data set.

Programs and Plugins Required

Qgis (3.22 was used in this tutorial) https://qgis.org/en/site/forusers/download.html
7Zip https://www.7-zip.org/
SCP Plugin (download through QGis)

Acquiring Data

For this tutorial pretty much any remote sensing data can be used. Depending on what data it is, it may require pre-processing in order to produce a usable product, and that is outside the scope of this tutorial.

A good resource for free remote sensing data is the USGS Earth explorer. It involves making an account but it is such a good resource for remote sensing data that it is completely worth the extra step.

Sign up here https://ers.cr.usgs.gov/login

Now that you have made an account or signed into your existing account it is time to find the data to use.

You can use pretty much whatever data you see fit but Landsat8 is a very common choice and so that is what's being used in this tutorial. That being said, the first step is to select which data set you want to use. To get to the Landsat 8 image used in this tutorial select Landsat ->Landsat Collection 2 Level-1 ->Landsat 8 OLI/TIRS C2 L1

The next step is to pick the additional criteria. In this tutorial the image used was LC08_L1TP_046007_20200722_20200911_02_T1 this name can be copy and pasted into the "Landsat Product Identifier L1" text box.

Next step is to click results.

The results should bring up one image. Click on the footprint icon to show where it is on the world map (should be in northern Canada).

To download the image click on the download icon, which the one with the little green arrow four icons left of the footprint.

Select the product options right at the top, click on the download button for the entire Landsat 8 Collection 2 Level-1 Product Bundle. It should be a big file (1.13 Gigabytes)

Unzipping Data

The data downloaded from USGS Earth Explorer is in .tar format and most windows machines are unable to extract that type of file without an additional program. You might have to download 7Zip (see Programs and Plugins Required)

To unzip the data simply right click on the file you wish to extract, select 7Zip and select extract here. This will not create a new folder it will simply extract the file into the folder that it is already in.
Once the extraction is done feel free to delete the .tar file as it is no longer needed.

Importing and Displaying Data

To import data, launch QGis and create a new project. Once the project is loaded go to the rater tab at the top:
Raster-> miscellaneous -> Build Virtual Raster. This will open up the build virtual raster tool. Click the thee dots beside the input layers tab.
The input layers are going to be everything that was extracted from the .tar file. To do this use the:
Add Directory button
navigate to the folder where the .tar file was extracted and click:
Select Folder button
This will bring up everything from that file but not everything is needed. De-select every layer that doesn't end in B followed by a number.
The only layers that should be selected end with:
Each file represents a different wavelength that the sensor collected. Landsat 8 uses a multispectral sensor so it collects the light spectrum across 11 wave lengths from blue to infrared. For instance, B2 is blue, B3 is green, B4 is red, B4 is NIR and B11 is thermal infrared. You can use the various bands to your advantage, for instance with band 6 which is a type of infrared, water absorbs so much of that band that it appears black. This could be really helpful for identifying how much water is in an area. Check out the USGS website for more information:https://www.usgs.gov/faqs/what-are-band-designations-landsat-satellites?qt-news_science_products=0#qt-news_science_products
With all the correct bands selected hit the ok button.
Change the resolution to highest
Check the box for place each band in separate band file.
Click run
The image should be displayed in the QGis project. The colouring might look weird and that is because it is currently a false colour composite. To change it to a natural colour composite:
Right click on the Virtual layer-> select properties-> select symbology
Select Band 04 for red
Select Band 03 for green
Select band 02 for blue
Click apply and then ok to close the window.
Now it should be a natural colour composite.

Trim to Mask Area

It is possible to do the classification for the entire Landsat image but it could take a long time so for this tutorial we are going to trim the raster to a mask layer. This will allow us to classify a smaller subset of the region.

To do this a new layer needs to be made:
Layer tab-> Create layer-> New shape file layer
The geometry type can be a polygon, but in this tutorial we are going to simulate a study site and create a buffer around it.
Geometry type should be point. Click ok.
To create the point click:
On the new layer-> toggle editing (pencil icon)-> Add point feature (button two down from toggle editing)-> click where you want the mask area to be centred -> set the point id to 1 -> click ok ->Click save edits (between toggle editing and add point feature) -> click toggle editing again.

Make the buffer by going to:
Vector-> Geoprocessing tools -> buffer
The input layer will be the point layer you just created and select the distance you want to classify, this will be the radius from the point. I used 10km, if km is not an option and it is in degrees this can be changed by changing the projection. Click run. this will draw a circle on your map.

To clip the raster to the new buffer layer go to:
Raster-> Extraction-> Clip raster by mask layer
Input layer will be the virtual raster, mask layer will be the buffer. Click run.

Unsupervised Classification and Filtering

Time to classify the image. You need the SCP tool. To download it go to:
Plugins-> manage and install plugins-> search up scp-> select the semi-automatic classification plugin-> install-> close.
to use the SCP tool: SCP->Band set. This will open the tool. Under the Multiband image list option right at the top, go to the drop down menu and select the Clipped (mask) band created. Hit the refresh button on the right (second from the top), select all the bands and hit the Add bands to bandset button. Those steps will bring all the bands into the tool.
Next step is to go to the clustering tab in the tool.
The inputs will be:
method: K-means
Number of classes: 5
Use random seed signatures
All other settings can be left default for now. Hit run, it will prompt you to save it so navigate to the correct folder and hit save. The classification might take a few minutes be patient.
SCP will classify a box around the buffer, don't worry we will take care of that later
I ended up with 5 classes: water, land, box around the buffer and two types of ice.
There might be some speckling in the classification; tiny regions of different classifications, this is likely due to noise in the imagery. To remove the noise open up the SCP tool again:
post processing tab-> Classification sieve. Set the threshold to 25 (makes the smallest area classified 25 pixels) and set the pixel connection to 8 (this means pixels touching on the diagonal are considered connected and in one cluster). Hit Run. You will be prompted to save the output, name it and hit save. The effect this will have is a more smoothed out classification.

Polygonization (Raster to Vector Conversion)

Rasters can sometimes be difficult to use when calculating land cover statistics about a region so a good option is to transform the raster layer into a vector layer. This can be done with the built in polygonization tool. This transforms every cluster in the classification into a separate polygon. To do this go to:
Raster-> Conversions-> Polygonize. The input layer will be the filtered layer created in the pervious step. Hit run.
The new polygon layer may end up one colour. To represent it as the different land use types:
right click on the layer-> Symbology-> Categorized->Set the value to DN (fid is just the id of each individual polygon DN is the value taken from the raster)-> Classify-> ok.

Field Calculator

To get some more useful information from this new poligonized classification right click on the layer and open up the attribute table. The DN field corresponds to the number of the land cover from the raster version of the classification but that is difficult to understand at a glance so we are going to add a new field that displays the name of the land cover type.
Open field calculator (from the attribute table screen)->Create new field-> name it landcover-> Output field type, text (string). In the expression box type in: CASE WHEN "DN" IS 1 THEN 'Land' WHEN "DN" IS 2 THEN 'Bound' WHEN "DN" IS 4 THEN 'Water' ELSE 'Ice' END
This expression will correspond all the DN values to a name that the field calculator will put in a new field. Change the numbers as necessary for you classification as it may not end up being identical to mine. Click ok.
this will add a new field with a land cover name for every polygon.

The SCP tool classifies a box round the entire region and now is a good time to get rid of that:
Click on the title for the field we just made, this will sort by name-> click on toggle editing in the attribute table-> select all the rows you want to delete (in my case everything named bound)-> hit the delete features button.

Now we will calculate another field, this field will calculate the area for each polygon. Once again open up the field calculator:
Create a new field-> name it area-> Output field type (decimal number)-> Open geometry tab, select $area (alternatively type $area into the expression box)-> click ok.
A new column with the area for each polygon will be calculated. The attribute table can be exported as a CSV for further analysis. To do this:
Right click on the layer-> export-> save feature as and fill in the required information.

Editing and Merging Polygons

Some of the polygons may be incorrect due to incorrect classification. There is luckily a simply way to change that. You can use the select features tool (should be in the top tool bar) and simply click on the polygons you wish to change. If you want to select multiple at a time hold down the control key. Alternatively if you know the polygons that need to be changed go into the attribute table and select them there. Next step is to open the attribute table. If you only want the selected features to show go to the bottom left of the attribute table and select show selected features. To change the type of land cover open the field calculator:
Make sure the only update selected features box is checked-> update existing field-> select DN-> in the expression box do the math to make the DN value correct for the land cover type that is supposed to be, in my case it is: "DN"-3 or "DN"=1 both will work since DN is four when it is supposed to be one-> hit ok.

The land cover field also needs to be updated because it will not update automatically. Open up the field calculator again and update the land cover field. Simply copy and paste in the same formula you used the first time to assign the land cover names. field CASE WHEN "DN" IS 1 THEN 'Land' WHEN "DN" IS 2 THEN 'Bound' WHEN "DN" IS 4 THEN 'Water' ELSE 'Ice' END -> click ok.
Hit the save button at the top of the attribute table and hit toggle editing to turn off editing.

The classified region is still being displayed based on its DN values so to make it easily interpreted, go back into symbology, change the value to the land cover field, hit classify and then ok. This is the same process you did earlier. If you had two different types of ice like me this will now cause them to be displayed as the same colour. With the the two different types of ice being the same colour, the region looks very crowded. This can be helped by merging the polygons into a bigger polygon.
Select the polygons you want to merge (either from the map or attribute table)-> toggle on editing-> Right click on any tool bar and make sure to activate your digitizing tool bar as well as your advanced digitizing tool bar-> merge selected features (on the advanced digitizing tool bar)-> click ok. This will merge the features.

Map Production

Before making a map now is a good time to change the display colours of the classification. To do this go to symbology and select a new colour for every category.
To make a map got to project-> new print layout-> name the layout-> hit ok.
Next step is to add the map to the layout. Add item-> add map-> click and drag to draw the map perimeter. You can turn layers on and off from the layer tab in the main QGis window and the map will auto update.
There is a wide variety of tools to make the map more appealing. Some important items to add can be found in the add item tab.
Some good items to include:
Scale bar
North Arrow
Please see the video walk though for more details about map creation. There are also lots of other good resources out there for map creation.

Congratulations! You have completed this unsupervised classification tutorial!


Unsupervised Classification using SCP: https://fromgistors.blogspot.com/2021/01/unsupervised-classification-using-scp7.html
USGS Earth Explorer: https://earthexplorer.usgs.gov/
Loading Landsat 8 into Qgis: https://www.youtube.com/watch?v=mBk2VIMawRE
Raster Conversion: https://docs.qgis.org/3.22/en/docs/user_manual/processing_algs/gdal/rasterconversion.html?highlight=raster%20vector
USGS band designations: https://www.usgs.gov/faqs/what-are-band-designations-landsat-satellites?qt-news_science_products=0#qt-news_science_products