Automating SAGA Workflows Using Command Line Scripting

From CUOSGwiki


Please note that this tutorial is not quite finished. Since there isn't a lot of material available on this topic, it will hopefully still prove useful to anyone researching it, and future contributions and edits are certainly welcome!


Introduction

Automating GIS tasks gives industry a means to repeat the same task efficiently. Doing so can reduce manual errors and increase productivity. Many platforms offer this capability, but this tutorial focuses on the System for Automated Geoscientific Analyses (SAGA), created by the Department of Physical Geography at the University of Hamburg in Germany. It is one of the more popular free and open-source GIS packages, and it offers both a command line interface and Python integration for executing GIS tasks.

This tutorial is aimed at those who have some comfort using the command line interface to execute commands and some basic Python skills.

Objectives

1. Understand how to call SAGA tools from the command line

2. Understand how to automate sequences of tasks using the command line and Python

3. Understand the implementation differences between Windows and Unix environments

Getting Started

Installation

To install the software, download and install SAGA and, if required, Python from their official websites.

Set PATH variable

To access the SAGA tools from a command line environment, you will use the saga_cmd tool (saga_cmd.exe on Windows). To make running it easier, you can add the SAGA directory to your system path so that your operating system knows where to find saga_cmd; otherwise, you'll have to enter the full path every time. To do so, use the command for your operating system, replacing [SAGA_PATH] with the path to the directory where SAGA was installed. Note that these commands only affect the current command line session.

Setting the PATH variable
Windows: set PATH=%PATH%;[SAGA_PATH]
Mac / Unix: export PATH=$PATH:[SAGA_PATH]
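If you are unsure whether the PATH change took effect, a short Python check can help. The following is a minimal sketch that uses only the Python standard library; the function name find_saga_cmd is made up for this example and is not part of SAGA.

```python
import os
import shutil


def find_saga_cmd(name="saga_cmd"):
    """Return the full path to the SAGA command line tool, or None if it is not on PATH."""
    # shutil.which searches the directories listed in PATH. On Windows it also
    # tries the extensions in PATHEXT, so "saga_cmd" matches "saga_cmd.exe".
    return shutil.which(name)


location = find_saga_cmd()
if location is None:
    print("saga_cmd was not found. PATH currently contains:")
    for entry in os.environ.get("PATH", "").split(os.pathsep):
        print("  " + entry)
else:
    print("saga_cmd found at " + location)
```

If the tool is not found, the printed list shows which directories were searched, which makes a mistyped [SAGA_PATH] easier to spot.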

SAGA GIS Command Line Interface

SAGA GIS provides access to its tools via a command line interface that allows a user to set the parameters for a tool and run it directly. By chaining these calls together in a script file, a user can build an analysis workflow that can be run repeatedly on demand or scheduled using Task Scheduler (Windows) or cron (Unix). This section covers the basics of the interface; for full details, refer to the SAGA tool documentation, where the page for each tool ends with an example of how to call it from the command line.

Accessing the Help & Module List

The results of running the saga_cmd --help command

The commands listed below can help you get started with the command line interface. The SAGA tool documentation is also a useful resource.

saga_cmd
When run with no parameters, this will display a list of available libraries.
saga_cmd --help
The --help flag will show you a summary of how to use the SAGA command line interface.
saga_cmd --help LIBRARY_NAME
When you also specify a library after the --help flag, the interface will show you a numbered list of the tools in that library.
saga_cmd --help LIBRARY_NAME TOOL_#
If you also specify a tool number, the help for that tool will be shown including the list of parameters.

Try out these commands to see if you can get the list of tools in the ta_morphometry and grid_spline libraries, and look at the parameters for the "Slope, Aspect, Curvature" and "Cubic Spline Approximation" tools. We'll use these in the next section.
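For readers who would rather explore the help from Python, the commands above can be assembled as argument lists and passed to subprocess. This is only a sketch: help_command and show_help are hypothetical helper names, the tool numbers 0 and 6 are the ones this tutorial uses for the two tools, and saga_cmd is only run if it can be found on your PATH.

```python
import shutil
import subprocess


def help_command(library=None, tool=None):
    """Build the argument list for the help commands described in this section."""
    if tool is not None and library is None:
        raise ValueError("a tool number is only meaningful together with a library name")
    args = ["saga_cmd", "--help"]
    if library is not None:
        args.append(library)
    if tool is not None:
        args.append(str(tool))
    return args


def show_help(library=None, tool=None):
    """Run the help command and return everything saga_cmd printed."""
    result = subprocess.run(help_command(library, tool), capture_output=True, text=True)
    return result.stdout + result.stderr


for library, tool in [("ta_morphometry", None), ("grid_spline", None),
                      ("ta_morphometry", 0), ("grid_spline", 6)]:
    print(" ".join(help_command(library, tool)))
    # Only call saga_cmd when it is installed and on PATH.
    if shutil.which("saga_cmd"):
        print(show_help(library, tool))
```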


Executing a Tool

The results of running the Cubic Spline Approximation tool on a DEM from Carp, Ontario. The results are stored in GridDEM.sdat.

To execute a tool, you have to give the command line interface the name of the library and the tool, followed by a set of parameters for the tool. These are documented both online and in the tool help.

For example, if we have a set of point data as a SHP file and want to interpolate it using the Cubic Spline Approximation tool (which is tool #6), we can look at the parameters for this tool to see what we need to pass to SAGA to execute it. These are the equivalent of the fields you would fill in to run the tool from the graphical interface. A select few are shown here; the full list can be obtained from the SAGA documentation if needed.

SHAPES
An input file of point data
TARGET_TEMPLATE
The output grid system
TARGET_OUT_GRID
The output file for gridded data (defaults to SAGA's .sdat format)


To call it on data located at [INPUT] and store the output as [OUTPUT], you would enter the following command. This will create [OUTPUT].sdat and store the interpolated grid at that location.

saga_cmd grid_spline 6 -SHAPES:"[INPUT]" -TARGET_OUT_GRID:"[OUTPUT]"
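The same call can be made from Python by building the argument list and handing it to subprocess. The sketch below mirrors the -NAME:value form used in this tutorial; tool_command and run_tool are hypothetical helper names, the file names are placeholders, and the error handling assumes that saga_cmd returns a non-zero exit code when a tool fails.

```python
import subprocess


def tool_command(library, tool, **parameters):
    """Build a saga_cmd argument list in the -NAME:value form used in this tutorial."""
    args = ["saga_cmd", library, str(tool)]
    for name, value in parameters.items():
        # No extra quotation marks are needed: each list item is passed to
        # saga_cmd as a single argument, even if the path contains spaces.
        args.append("-{}:{}".format(name, value))
    return args


def run_tool(library, tool, **parameters):
    """Run the tool; raises CalledProcessError on a non-zero exit code (assumed to mean failure)."""
    subprocess.run(tool_command(library, tool, **parameters), check=True)


# Placeholder file names, used only for illustration.
command = tool_command("grid_spline", 6, SHAPES="points.shp", TARGET_OUT_GRID="GridDEM")
print(" ".join(command))
```

Once saga_cmd is on your PATH, calling run_tool("grid_spline", 6, SHAPES="points.shp", TARGET_OUT_GRID="GridDEM") would execute the tool rather than just print the command.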


Chaining Tools Together

One advantage of the command line interface is that you can execute multiple commands in sequence. This allows you to build a workflow of tools to be executed on an input dataset so that you get repeatable results without tedious, error-prone manual work. There are several methods for doing this; here, the Windows batch script is described, but it is also possible to create a bash script on Unix environments that operates in the same fashion. Python can also be used to chain operations together.

For example, let's say we have point data as our input. We want to run a cubic spline interpolation and then create a slope grid from the interpolated data, and we want the user to be able to specify the input point dataset as well as an output folder that will contain the interpolated DEM and the slope data. On Windows, we can reference parameters using %1, %2, etc. We can also specify a tilde between the percent sign and number (e.g. %~1) to remove any quotation marks the user might have added to handle paths with spaces in them. Assuming the first parameter will be the input file and the second the output directory, we can create a batch file like this:

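REM %~1 = input point shapefile, %~2 = output directory (surrounding quotes removed)
saga_cmd grid_spline 6 -SHAPES:"%~1" -TARGET_OUT_GRID:"%~2\GridDEM.sdat"
REM Use the interpolated grid from the first step as the elevation input
saga_cmd ta_morphometry 0 -ELEVATION:"%~2\GridDEM.sdat" -SLOPE:"%~2\Slope.sdat"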

When the batch file is called with appropriate parameters, it will call both of the specified SAGA tools in order, using the output of the first command as an input for the second. In this way, we can chain together tools to execute a workflow using Windows batch files. An example command is below, assuming the batch file was saved as slope_from_points.bat, with CarpDEM.shp as the input and the output stored in the same directory as the input.

F:\School\slope_from_points.bat F:\School\CarpDEM\CarpDEM.shp F:\School\CarpDEM
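Since the batch file above only works on Windows, one way to get a script that runs on both Windows and Unix is to write the same two-step chain in Python. This is a sketch rather than a tested workflow: workflow_commands and run_workflow are hypothetical names, the paths are placeholders, and stopping on failure relies on the assumption that saga_cmd returns a non-zero exit code when a tool fails.

```python
import subprocess
from pathlib import Path


def workflow_commands(points_file, output_dir):
    """Return the two saga_cmd calls from the batch file, in execution order."""
    # pathlib joins paths with the separator of the current operating system,
    # so no backslashes or forward slashes are hard-coded here.
    dem = Path(output_dir) / "GridDEM.sdat"
    slope = Path(output_dir) / "Slope.sdat"
    return [
        ["saga_cmd", "grid_spline", "6",
         "-SHAPES:{}".format(points_file), "-TARGET_OUT_GRID:{}".format(dem)],
        ["saga_cmd", "ta_morphometry", "0",
         "-ELEVATION:{}".format(dem), "-SLOPE:{}".format(slope)],
    ]


def run_workflow(points_file, output_dir):
    """Run both tools in order, stopping at the first one that fails."""
    for command in workflow_commands(points_file, output_dir):
        # check=True raises CalledProcessError on a non-zero exit code, so the
        # slope step never runs on a grid that was not created.
        subprocess.run(command, check=True)


# Placeholder paths, used only to show the commands that would be run.
for command in workflow_commands("CarpDEM.shp", "output"):
    print(" ".join(command))
```

Calling run_workflow with the input file and output directory plays the same role as running the batch file with two parameters.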

More complex workflows, including error checking on the input variables and looping over multiple files, are possible, but they require more advanced knowledge of scripting in your environment and are beyond the scope of this tutorial. Good references for those interested are the WikiBook on Windows Batch Scripting, or Advanced Bash-Scripting for those on Unix environments.
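For readers who prefer Python to batch or bash, input checking and looping over files can be sketched with the standard library alone. The function names and the "_dem" suffix for output files below are made up for this illustration, and the failure check assumes that saga_cmd returns a non-zero exit code when a tool fails.

```python
import subprocess
from pathlib import Path


def find_point_files(input_dir):
    """Return the .shp files in input_dir, sorted by name."""
    input_dir = Path(input_dir)
    if not input_dir.is_dir():
        raise NotADirectoryError("{} is not a directory".format(input_dir))
    return sorted(input_dir.glob("*.shp"))


def interpolation_command(points_file, output_dir):
    """Build the Cubic Spline Approximation call for one input file."""
    # The "_dem" suffix is an arbitrary naming choice for this example.
    grid = Path(output_dir) / (Path(points_file).stem + "_dem.sdat")
    return ["saga_cmd", "grid_spline", "6",
            "-SHAPES:{}".format(points_file), "-TARGET_OUT_GRID:{}".format(grid)]


def process_folder(input_dir, output_dir):
    """Interpolate every shapefile in input_dir and return the files that failed."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for points_file in find_point_files(input_dir):
        result = subprocess.run(interpolation_command(points_file, output_dir))
        # A non-zero exit code is assumed to mean the tool did not succeed.
        if result.returncode != 0:
            failed.append(points_file)
    return failed
```

Collecting the failures instead of stopping at the first one lets a long batch run to completion, after which the returned list shows which inputs need attention.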

SAGA Python API

The Python API for SAGA must be compiled using Visual Studio and is therefore recommended only for advanced users who are familiar with compiling C++ code. If you're interested, you can follow the directions available at the SAGA Python API page. One advantage of Python is that you can retrieve the grid directly from the tool and keep it in memory instead of writing it to an intermediate file, which can improve processing time. Executing a tool in Python is very similar to the command line, except that there is more setup work to be done. Once you have a saga_api object, it's simply a matter of retrieving a tool definition, setting the parameters, executing the tool and accessing the output.

Additional Resources

For more information, you can read the following pages: