Over the years, various data processing techniques have been developed to transform surveying field data into meaningful information. Surveying data are mostly numeric and are therefore amenable to mathematical computation and manipulation. Surveyors use a wide range of complex mathematical models to process data from different field operations [3]. These models help to simplify the computational process, but applying them manually remains laborious. For instance, it is cumbersome to reduce and adjust the level at each chainage in a levelling network that runs for kilometres. Similarly, reducing and adjusting data from triangulation and classical traverse networks can be arduous and time-consuming. In addition, the volume of sounding data obtained from bathymetric operations can delay processing, especially when a tidal reduction has to be applied to each sounding.
Because these data processing operations are often repetitive, program code has been developed for each operation, based on its underlying logic and mathematical models, to speed up and automate the computations. Programming languages such as FORTRAN and Visual Basic have been used to develop some of these automated packages.
Despite the growing number of programming languages and their capacity to handle basic field survey data processing and computations, surveyors in developing countries such as Nigeria still have to reduce, adjust and compute some basic survey quantities manually. This challenge arises because most available programs do not accept raw data in the formats in which surveyors record their field books on site. As a result, the surveyor has to reformat the data manually to suit the input format of the program.
2.1 Python language and spatial data analysis
In the past, FORTRAN and Visual Basic were mostly used to develop surveying data processing applications because, at the time, these languages were well suited to scientific programming. However, they have limitations that make newer, more user-friendly object-oriented languages such as Python better suited to Geoinformatics tasks (spatial data processing, analysis, and visualisation). Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its high-level built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development (RAD), as well as for use as a scripting or glue language to connect existing components [4]. The language emphasises code readability and a concise syntax that lets a programmer write applications in fewer lines of code than many other programming languages require. A programmer can also choose a coding style that suits the task, given that Python supports the functional, imperative, object-oriented, and procedural styles. In addition, because of its design and its many built-in libraries, Python is used in a wide range of fields and lends itself to educational and other uses for which other programming languages can fall short. Furthermore, Python supports modules and packages, which encourages program modularity and code reuse. The Python interpreter and extensive standard library are available in source or binary form without charge for all major platforms and can be freely distributed. Many programmers prefer working with Python because of the increased productivity it provides. Since there is no compilation step, the edit-test-debug cycle is fast and easy.
For these reasons, Python has become a fast-growing programming language in various fields, with good potential in geomatics. In addition, Python can process data presented in different formats, including text files (e.g. CSV) and spreadsheets (Excel or similar formats), so it is well suited to processing geomatics field data, which are often presented in these formats. It was therefore chosen as the programming language for this research.
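As an illustration of this point, the following minimal sketch reads a small levelling field book stored as CSV using only the Python standard library. The column names (station, backsight, intersight, foresight, remarks) and the sample readings are assumptions made for illustration and are not taken from the application described in this paper. Blank cells, which are normal in a level book, are kept as missing values rather than zeros.

```python
import csv
import io

# Hypothetical level-book layout; column names and readings are illustrative only.
FIELD_BOOK = """station,backsight,intersight,foresight,remarks
BM1,1.250,,,Benchmark
CH0+000,,1.480,,
CH0+025,0.985,,1.760,Change point
CH0+050,,,1.105,
"""


def read_level_book(handle):
    """Return one dict per field-book row, with blank readings stored as None."""
    rows = []
    for record in csv.DictReader(handle):
        row = {"station": record["station"], "remarks": record["remarks"]}
        for column in ("backsight", "intersight", "foresight"):
            text = record[column].strip()
            # An empty cell means no reading was booked in that column.
            row[column] = float(text) if text else None
        rows.append(row)
    return rows


# io.StringIO stands in for a file opened with open(path, newline="").
rows = read_level_book(io.StringIO(FIELD_BOOK))
for row in rows:
    print(row)
```

Keeping the table in the same column order as the field sheet means the surveyor can compare the imported rows with the booked values line by line.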
2.2 Related Works
Several studies have attempted and achieved the automation of survey data processing and analysis. Various programs have been developed for the reduction and adjustment of field survey data and for the general computation of other survey problems. Diverse methodologies, programming languages, and levels of data processing have been deployed, depending on the type of mathematical model used and the scope of the processed data. Some previous studies, such as [2, 5–6], deployed Visual Basic, while [7] used MATLAB to develop routines that reduce traverse network data. In these studies, the researchers drew on their own insight to tailor the program to the expected data transformation and analysis needs of the user. Most studies that dealt with traverse computations of a closed-loop network make provision for computing the area of the network, e.g., [2, 5–6]. Other computation routines have been developed to automate the reduction and adjustment of level networks, such as [2, 8–9].
Despite the progress made by previous studies, there is particular interest in the seamless end-to-end automation of survey data processing, in order to minimise errors that could be introduced during processing. In most of the studies reviewed, the researchers developed their programs either to accept one data record at a time, or to import field data in a specific format that does not always match the common formats used to record the data in the field. Hence, the field data must be transformed (manually pre-processed) into a format that is compatible with the processing software. Both approaches are flawed. The former (entering individual data lines) impedes the seamless automation of data reduction and adjustment, while the latter restricts the user to a particular file format, which the user may be unfamiliar with. In either case, the user is slow to adapt to the software and may face a steep learning curve that could discourage potential users.
Moreover, programs that prompt for user input could lead to typing errors (blunders) during data entry. For instance, the SurveyBuddy application developed by [2] adjusts a traverse by prompting the user to enter field data records one at a time. As the user enters the data, the program populates its database. Although the database is displayed so that the user can inspect and validate the data, it does not replicate the format in which the field surveyor would normally complete a field sheet, so data entry errors may not be easy to spot. Similarly, the TravCAD software developed by [6] requires the user to transform the data into formats acceptable to the package, which differ considerably from the conventional field sheet format. This is also the case for the software developed by [7]. The WOLFPACK software also falls short in this respect. These limitations hinder the adoption of the software by users.
Furthermore, most of the previous research that included area computation did not provide it as a standalone program. The researchers integrated area computation into their closed-loop traverse routines, so the area computation module is only activated when the user wishes to compute the area of a closed traverse. However, area computation is a very important aspect of a field surveyor's work and may be required in projects that do not involve a traverse. Hence, there is a need for a standalone program that computes the area of a network even when the coordinates of the bounding region were not obtained from a traverse operation.
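To make the idea of a standalone area routine concrete, the following minimal sketch computes the area of a closed figure directly from boundary coordinates using the standard coordinate (shoelace) formula. It is not the implementation used in this study; the function name polygon_area and the sample parcel coordinates are assumptions for illustration. The points are assumed to be (easting, northing) pairs in metres, listed in order around a boundary that does not cross itself.

```python
def polygon_area(coordinates):
    """Area of a simple polygon from (easting, northing) pairs in boundary order."""
    if len(coordinates) < 3:
        raise ValueError("at least three boundary points are required")
    twice_area = 0.0
    for i, (e1, n1) in enumerate(coordinates):
        # The modulo wraps the last point back to the first to close the figure.
        e2, n2 = coordinates[(i + 1) % len(coordinates)]
        twice_area += e1 * n2 - e2 * n1
    # The sign depends on the direction of travel, so take the absolute value.
    return abs(twice_area) / 2.0


# Hypothetical rectangular parcel, 40 m by 30 m.
parcel = [(1000.0, 1000.0), (1040.0, 1000.0), (1040.0, 1030.0), (1000.0, 1030.0)]
area_m2 = polygon_area(parcel)
print(f"{area_m2:.2f} m^2 = {area_m2 / 10000:.4f} ha")
```

Because the routine needs only a list of coordinates, it works equally well with points obtained from a traverse, from GNSS observations, or from an existing plan.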
In the case of bathymetric data processing, the available packages are integral to particular bathymetric instruments and are therefore designed to process data from the associated instrument. For instance, the Innomar ISE software can post-process data acquired using Innomar's SES-2000 parametric sub-bottom profilers. This means that similar data from other sub-bottom profilers cannot be processed by this software. Restricting processing to data acquired with a specific hydrographic instrument impedes the general processing of sounding data. To the best of our knowledge, no paper has attempted to develop a survey application that can process data generated by different hydrographic instruments using a consistent format. Therefore, there is a need for a data processing application that is agnostic to the acquisition instrument.
In the course of this research, we found that [10] and [11] had used the Python programming language for some geomatics data processing. While [10] used the language for scripting in the ArcGIS geoprocessor, [11] used it for planetary data processing. However, we were unable to find any previous research that has used Python to develop a comprehensive geomatics application that can process data acquired from various geomatics operations (traverse, levelling, bathymetry, resection, and intersection). This is despite the fast development of the Python programming language and its vast potential in geospatial data management. Such an application would encourage other Python programmers in the survey profession to investigate the use of Python for geospatial data processing, analysis, and visualisation. A key consideration in this study was to develop an application that is cheap, user-friendly, and capable of processing data presented in different formats and from diverse instruments.