3.1 The concept
The need for a single site for storing and disseminating public French groundwater data led to development of the ADES database (www.ades.eaufrance.fr), initiated in 1999 and operational in 2002. The ADES system had to accommodate the new data acquired for WFD monitoring, but also the historical data in France that complied with the national Open Data Policy. ADES is the result of a joint effort of the French geological survey (BRGM, in charge of the development, maintenance and operation of the ADES database), the Ministry in charge of Ecology, the Office of French Biodiversity (OFB), the Ministry of Health, the Water Agencies, and the regional environment directorates (DREAL).
The ADES system is based on monitoring networks, each dedicated to either groundwater quality or quantity and belonging to a single producer. Each network is focused on one of the three possible purposes of groundwater networks: general knowledge, use and impact. The networks contain measuring stations to which the producers contribute data. A station monitored for water quality can belong to several networks, managed by different partners simultaneously. However, an observation well monitored for water levels can be followed by only one producer at a given time, although different producers can succeed each other over time. This management rule is linked to the excessive complexity of managing the measurement references of various potential producers. Moreover, as the current trend is toward equipping structures with continuous data-acquisition units, there is no longer any point in having several organizations monitoring the same structure.
The ADES database currently contains several million data on groundwater quality and quantity, acquired from more than 245 unitary networks covering more than 61,800 stations monitored by over 180 organizations. In addition to these unitary networks, there are also 67 meta-networks, i.e. groups of unitary networks containing all or part of the points of the primary networks. These meta-networks serve to combine, in the same network, different points that have the same purpose but are monitored by different producers. The reasons for grouping points in the same meta-network can be geographical: for example, the national meta-network "monitoring control of the chemical status of groundwater in France" combines the monitoring control networks set up at the basin scale. They can also be related to a technical or thematic grouping: for example, the water-level drought alert network of the Artois-Picardie basin (northern France) groups specific points monitored in different water-level networks of the basin. The current data producers are territorial or local authorities, public institutions and decentralized Government services (water agencies, regional health agencies, departmental and regional councils, BRGM regional offices, etc.). They also include some private unions and associations.
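As an illustration of the network structure described above, the following minimal Python sketch models unitary networks and a meta-network that groups all or part of their points. The class names, field names, and network and station codes are hypothetical and do not reflect the actual ADES schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class UnitaryNetwork:
    """A monitoring network owned by a single producer (hypothetical model)."""
    code: str
    producer: str
    theme: str  # "quality" or "quantity"
    stations: Set[str] = field(default_factory=set)


@dataclass
class MetaNetwork:
    """A group of unitary networks, holding all or part of their points."""
    code: str
    members: List[UnitaryNetwork] = field(default_factory=list)
    selected: Optional[Set[str]] = None  # None means "all points of the members"

    def stations(self) -> Set[str]:
        # Union of the points of every member network
        all_points = set().union(*(n.stations for n in self.members))
        # Optionally restrict the meta-network to a chosen subset of points
        return all_points if self.selected is None else all_points & self.selected


# Two networks run by different producers, grouped for a common purpose
net_a = UnitaryNetwork("NET_A", "Producer A", "quantity", {"ST1", "ST2"})
net_b = UnitaryNetwork("NET_B", "Producer B", "quantity", {"ST3"})
drought_alert = MetaNetwork("META_1", [net_a, net_b], selected={"ST2", "ST3"})
print(sorted(drought_alert.stations()))  # ['ST2', 'ST3']
```

In this sketch the meta-network stores no data of its own: it only references points that remain attached to their unitary networks and producers.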
The information contained in ADES concerns networks, measuring points, and the results of quantitative (water levels) and qualitative (physical and chemical parameters) measurements. When setting up the ADES database, several challenges had to be met, such as being a tool for gathering and storing information on groundwater; being a communication tool allowing the availability of public groundwater data and their exchange on a single and free website; or being used as a water-management tool covering the needs of local water management and the priorities specified in the European WFD, i.e. the monitoring of the state of groundwater resources and the implementation and evaluation of water management plans. Having all public data on groundwater in the same website facilitates the access to and processing of groundwater issues in France, and offers many study and research possibilities. Scientists with the expertise of interpreting and integrating such data will be able to answer relevant questions on a regional or even national scale.
3.2 Water-exchange data models
Read et al. (2017), and references therein, highlighted the need for standardized databases. Therefore, in order to aggregate and standardize data from different producers, data models shared by all producers have been set up. In France, the National Water Data Secretariat (Sandre; www.sandre.eaufrance.fr) develops a common language of water data and repositories for the Water Information System (SIE). It has established French data models for the exchange of data on groundwater quality and quantity. Data models are structures for exchanging data and their metadata (“data on data”), which are essential for understanding and interpreting the data. Such data models are described in exchange scenarios, i.e. specification documents that describe how to exchange data in a specific context and with a specific format. These documents detail the semantics, the mandatory and optional information, the syntax, the exchanged data, and the technical and organizational modalities of the exchange. Data dictionaries are specification documents that describe and clarify terminology and data for a particular field. They cover several aspects of the data, e.g., their meaning, the essential rules for their drafting or codification, the list of values they can have, and the persons or organizations that have the right to create, consult, modify or delete them.
In ADES, data and metadata are stored in tables, and each attribute is located in a column or field. Keys are used for establishing relationships between the different tables (e.g., the station code is used for linking several tables). Thus, the results are linked to the sites and their characteristics (water body, for example). For banking data in ADES, data producers must inform the animation unit, which gives them specific rights on the monitoring network(s), the theme(s) (quality and/or quantity), as well as on the measurement station(s). In addition, they must respect a specific exchange model for each type of data (quality or quantity). Consistency checks are carried out at the time of data integration, to ensure that the data model is respected. If the information transmitted does not comply with the data model, it is refused, and the producer is informed. The producer must then correct the errors before re-submitting the data. As mentioned above, the data model includes the metadata necessary for an accurate description of the data. For example, chemical data cannot be stored without specifying the associated measurement unit. Most of the metadata are in mandatory fields and data cannot be stored if these fields are empty. Indeed, as Sprague et al. (2017) pointed out, metadata are essential for understanding and therefore exploiting the data. In addition, metadata must be complete and unambiguous: without the accompanying metadata, data become unusable and the efforts deployed for their acquisition are wasted.
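The consistency check on mandatory fields described above can be sketched as follows. This is a minimal illustration only: the field names and the list of mandatory fields are assumptions, not the actual Sandre exchange model, and the real checks in ADES are not documented here.

```python
# Hypothetical subset of mandatory fields; the real exchange model defines its own list.
MANDATORY_QUALITY_FIELDS = (
    "station_code", "sampling_date", "producer", "parameter", "unit", "value",
)


def check_record(record: dict) -> list:
    """Return the names of mandatory fields that are missing or empty."""
    return [name for name in MANDATORY_QUALITY_FIELDS
            if record.get(name) in (None, "")]


def integrate(records: list) -> tuple:
    """Split records into accepted ones and rejected ones with their errors."""
    accepted, rejected = [], []
    for record in records:
        errors = check_record(record)
        if errors:
            # Refused: the producer would be informed of the offending fields
            rejected.append((record, errors))
        else:
            accepted.append(record)
    return accepted, rejected


complete = {"station_code": "STATION_1", "sampling_date": "2020-05-04",
            "producer": "Producer A", "parameter": "nitrate",
            "unit": "mg/L", "value": 12.5}
no_unit = dict(complete, unit="")  # same analysis, but without its unit

accepted, rejected = integrate([complete, no_unit])
print(len(accepted), rejected[0][1])  # 1 ['unit']
```

Returning the list of offending fields, rather than a simple pass/fail flag, mirrors the idea that the producer is told what to correct before re-submitting.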
For all quality data, i.e. for each chemical analysis, 40 fields, including the measurement itself, are requested, and 18 are mandatory for data submission, such as measuring point, sampling date, data producer, parameter, unit, analytical method and laboratory. The producer can also provide information on the samples containing the different analyses, such as sampling depth, flow rate, purged volume, and sampling and preservation methods. For groundwater levels the data model is simpler, containing eight mandatory fields (date, measuring method, nature, etc.). Among the mandatory fields, the data producer must indicate for each data item its status and qualification. The status of the data indicates the progress of data validation, from raw data (from the acquisition process and not having undergone any examination) to data controlled at different levels (value seen by a human or automatic expert system up to a reported value). If the data were checked, the qualification indicates whether they are correct, uncertain or incorrect. The producer can at any time change the status and qualification of the data in ADES. For example, raw data can be stored in the database and their status and qualification changed once the necessary consistency checks are done. Similarly, the producer can at any time correct data or metadata with errors (on data, measuring unit, date, etc.). Since each data item is linked to the producer who banked it, the producer can be contacted at any time to verify a value that seems doubtful, or to correct it if necessary.
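The interplay between status and qualification can be illustrated with a small sketch. The enumeration values below are simplified assumptions (the text describes several control levels, collapsed here into a single "checked" status) and are not the official Sandre codes.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    RAW = "raw"          # straight from acquisition, not examined
    CHECKED = "checked"  # simplification of the several control levels


class Qualification(Enum):
    NOT_QUALIFIED = "not qualified"
    CORRECT = "correct"
    UNCERTAIN = "uncertain"
    INCORRECT = "incorrect"


@dataclass
class Measurement:
    station_code: str
    producer: str  # kept with each value so the producer can be contacted
    value: float
    status: Status = Status.RAW
    qualification: Qualification = Qualification.NOT_QUALIFIED

    def validate(self, qualification: Qualification) -> None:
        """Mark the value as checked and record the outcome of the check."""
        if qualification is Qualification.NOT_QUALIFIED:
            raise ValueError("a checked value needs an explicit qualification")
        self.status = Status.CHECKED
        self.qualification = qualification


# A value is first banked as raw data, then qualified after the checks
m = Measurement("STATION_1", "Producer A", 12.3)
m.validate(Qualification.CORRECT)
print(m.status.value, m.qualification.value)  # checked correct
```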
3.3 The referential
All metadata are coded according to a precise reference system (referential) shared by the different actors of the French water-information system, according to the specific themes of each system. The national Water Data Secretariat (Sandre) administers most of the shared repositories, such as parameters, units of measurement, fractions analysed, support, analysis methods, etc. Other reference systems, specific to certain themes, are developed by other organizations and only distributed by the Sandre. For groundwater, BRGM manages the reference system for French water points as well as the reference system for hydrogeological entities. For the WFD, the regional Water Agencies ensure the management of the referential of groundwater bodies. Hereafter, we will describe the reference systems for hydrogeological entities and for water points.
3.3.1 Hydrogeological entities referential
Groundwater management responds to economic, societal and environmental issues that require a suitable knowledge of the subsurface and its physical properties, in order to assess the status of the water and to guide any actions to be taken for resource preservation. The hydrogeological referential database of aquifer boundaries (BDLISA, www.bdlisa.eaufrance.fr) is a national tool providing a scientific framework in the field of hydrogeology. BDLISA is a cartographic referential of the Water Information System. This database has been defined and elaborated at the “local” 1:50,000 scale, so called as all (hydro)-geological objects of the French territory have been mapped and described at this scale. The division into hydrogeological entities is a finer division than that of the water bodies, which are the reporting units for the European WFD, as a groundwater body consists of one or more hydrogeological entities (www.wiser.eu).
Most points monitored in ADES are related to a BDLISA hydrogeological entity and to a water body, indicating in which formation the measurements were made. This allows comparing the data between points of the same entity, and identifying the natural geochemical background of each entity as well as potential pollution problems.
3.3.2 Water points referential
Each measurement site is referenced in the repository of water points (Fig. 1), which is part of the French national subsurface database (BSS). The BSS is organized and managed by BRGM, the French geological survey, and contains information on underground structures (boreholes and wells) and work carried out in France for more than a century. Most of the data originated from drillers, operators, and engineering and design departments, and are freely accessible via Infoterre (http://www.infoterre.fr/). Each site is identified by a unique national station code (BSSID). The BSS database contains 888,900 points of which more than 523,700, or about 60%, are water points.
The information stored in the subsurface database concerns four main types of parameters that are all sent to ADES (Fig. 1). They are: i) Positioning of the measurement point (department, city, coordinates, altitude, etc.); ii) Technical information of the point (depth, technical sections and equipment for boreholes, etc.); iii) Geological information (cross sections, lithology, lithostratigraphy, chronostratigraphy); and iv) Hydrogeological information, such as the nature of the water point (borehole, spring, etc.), the type of reservoir (free, confined, semi-confined, etc.), the state of the water point (operational, plugged, etc.), and the water body and the hydrogeological entity.
Groundwater points have been linked to groundwater bodies and the BDLISA reference system for several years. This work is carried out through analysis of the technical, geological and hydrogeological sections of the boreholes, when available, and of the known hydrogeological context. These links are fundamental for knowing which water body or hydrogeological entity the data refer to. In particular, they allow the processing of data at the scale of water bodies, as required for WFD reporting. Understanding of the link between water points and aquifers has enabled, for example, the work by Valdes et al. (2014), who used chemical data from ADES to determine the average geochemical properties in the Chalk aquifer of the Upper Normandy region in France. Of the 523,700 water points in the BSS, nearly 154,300 are attached to a hydrogeological entity (about one-third of the points). Similarly, over 248,700 (nearly 50%) are linked to a water body. Of the more than 61,800 water points monitored in ADES, over 85% have a declared hydrogeological entity and nearly 97% a water body.
The information relating to the monitoring of water points is stored in ADES. This includes the start- and end dates of monitoring, the water-level measurement reference (top of casing, ground level, etc.), the frequency of data acquisition, the method used, etc.
3.4 Contents of the database and overview of the data set
3.4.1 Measurement points
More than 61,800 water points with data are part of the 245 unitary monitoring networks in ADES. Among these, over 4,360 are used for measuring water levels, more than 58,600 are quality measuring devices (called “qualitometers” in ADES), and about 1,170 incorporate both types of monitoring. These stations are located in metropolitan France (Fig. 2) as well as in the five French overseas departments (Reunion, Martinique, Guadeloupe, French Guiana and Mayotte). In addition to these stations, BRGM, as the national geological survey, has data on 17,600 observation wells it manages and that are not part of other monitoring networks. In 2020, BRGM decided to make available via ADES the data from these additional stations. The spatial distribution of the measuring stations of the ADES networks (Fig. 2), and thus of the data, varies according to several criteria, including the local geology, the presence, type and extent of aquifers, the existence of groundwater access points, groundwater use (mandatory monitoring for drinking water), the presence and monitoring of potential sources of pollution, and the regulatory texts governing groundwater monitoring (national water laws, European WFD, etc.).
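The station counts given above can be cross-checked with a short calculation. It assumes that the stations with both types of monitoring are included in each of the two thematic counts, which is an interpretation of the text rather than a documented rule; the inputs are the rounded figures quoted above, so the result only approximates the stated total of more than 61,800.

```python
water_level_stations = 4_360  # stations measuring water levels
quality_stations = 58_600     # "qualitometers"
both_types = 1_170            # stations with both types of monitoring

# Stations monitored for both themes are counted in each group,
# so they are subtracted once to obtain the number of distinct stations.
distinct_stations = water_level_stations + quality_stations - both_types
print(distinct_stations)  # 61790
```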
The most monitored water bodies are those in the sedimentary and alluvial domains, which are often large aquifers under high anthropogenic pressure. For example, in the two largest sedimentary basins, Paris and Aquitaine basins, agriculture is highly developed, leading to pressure on groundwater resources. Furthermore, cities and industries are developed along rivers, creating much pressure on alluvial aquifers.
The evolution of the number of new measuring stations declared each year in ADES illustrates the different policies successively implemented over time, as well as the development of specific networks. For quality monitoring, for example, the evolution of the number of points between 1980 and 2020 (Fig. 3) shows that groundwater monitoring in France was initiated well before the implementation of the European WFD in 2000. Moreover, the large increase in the number of quality-monitoring stations in 1996–1998 is related to the start of data transfer from groundwater-quality monitoring networks for drinking water supplies. For 2003–2004, the effect of implementing the WFD monitoring networks is visible. Since 2004, the number of new “qualitometers” has constantly decreased, with an average of about 400 new “qualitometers” per year over the past 10 years. The reasons for this slowdown in the number of new installations are that the coverage imposed by the WFD has now been reached in France, and that the cost of such monitoring is significant. Adding new monitoring stations for the WFD is not yet justified by the available means and the expected results. The new stations are either replacements for stations abandoned because they became inaccessible, or new ones belonging to new monitoring networks, generally supported by watershed associations or communities of municipalities that feel the need to make their data available.
3.4.2 Water-level data
The quantitative data are groundwater levels, expressed as depth and/or as elevation in the NGF system (French height reference system). In the current system, only one value per day can be stored per point. Generally, for points equipped with automatic acquisition systems, the measurement time step is hourly. In order to respect the constraint of one value per day in the database, the daily maximum in NGF is stored because it represents the shallowest depth and therefore the value closest to the static water level. More than 17 million water-level data from more than 4,360 stations were stored in ADES in January 2021. In recent years, an average of 1 million data per year have been stored in ADES. About 2% of the points have only one measurement. Such points often were part of transitory networks whose data were used for constructing water-level maps. The average number of data per point in ADES is about 3,900; the observation well with the most data (over 18,200 water-level measurements) is located in the south of France and has been monitored since 1970 (point 0999X0521/P4B).
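The reduction of hourly records to one value per day, as described above, can be sketched in a few lines. This is an illustrative implementation using only the Python standard library; the function name and the sample readings are hypothetical, and the actual processing chain used for ADES is not described in this section.

```python
from collections import defaultdict
from datetime import datetime


def daily_maximum(hourly):
    """Keep, for each calendar day, the highest water level (in m NGF).

    `hourly` is an iterable of (timestamp, level_ngf) pairs.
    """
    by_day = defaultdict(list)
    for timestamp, level_ngf in hourly:
        by_day[timestamp.date()].append(level_ngf)
    # The highest elevation corresponds to the shallowest depth of the day
    return {day: max(levels) for day, levels in sorted(by_day.items())}


readings = [
    (datetime(2021, 1, 1, 0), 101.20),
    (datetime(2021, 1, 1, 12), 101.35),
    (datetime(2021, 1, 2, 6), 101.10),
]
for day, level in daily_maximum(readings).items():
    print(day.isoformat(), level)
# 2021-01-01 101.35
# 2021-01-02 101.1
```

Working in NGF elevation rather than depth matters here: the maximum elevation is the minimum depth, so taking the maximum of depth values would select the opposite extreme.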
The observation well with the longest follow-up is located in northern France, dating back to March 1899. It is still active and has nearly 11,200 data in ADES (point 00147D0218/P1).
Moreover, as mentioned above, the data from 17,600 BRGM observation wells were added to ADES in 2020. The number of measurements associated with these points exceeds 1.5 million. However, for 40% of these specific points, only a single measurement was available and entered in ADES. Generally, such a measurement was made at the end of drilling, or was needed for making a water-level map. The oldest water-level measurement dates back to 1829, when a 70-m-deep borehole was drilled north of Paris in Lutetian limestone. Of the 17,600 points, only 40% have more than 10 data; nevertheless, for some points, the time series can be much longer. For instance, point 00171X0007/P1 in the north of France has almost 11,000 data and was monitored from 1965 to 2001.
An important part of today’s water-level monitoring stations are part of the Water Framework Directive network for which the BRGM is in charge of more than 1,620 stations out of the 1,770 active stations in the network. This network also belongs to the OZCAR-RI, French Network of Critical Zone Observatories (www.ozcar-ri.org).
As technology has evolved over time, it became possible to move from one-time measurements (usually monthly) to continuous data acquisition with recorders that had to be read regularly. Subsequently, the generalization of remote data transmission has eased data availability. Stations can now be remotely interrogated as often as desired, thus reducing banking and broadcasting time. After data were loaded into the ADES database first on a monthly basis, then on a half-monthly or even weekly basis, the current trend is to make "raw" data available in real time. This has now become a necessity, given society’s current needs and expectations (quantity and quality of water resources, pressures, climate change, drought, rising or falling water tables, etc.). For the 1,500 stations equipped with GPRS technology (General Packet Radio Service), the raw hourly data of the previous day are sent each morning to the BRGM site, for later display in ADES. Such raw data are broadcast in "quasi" real time and have not been validated. Conventional data-banking and data-validation processes follow over the next days. The advantage of this procedure is daily access to water-level data in monitored aquifers without having to wait for validation. The data thus acquired and made available in quasi real time are, for example, taken up and used by the website Météeau Nappes (https://meteeaunappes.brgm.fr), a tool for real-time monitoring and forecasting of water-table data. The availability of real-time data from environmental sensors is a challenge for exploiting the data as quickly as possible (Wong and Kerkez 2016).
3.4.3 Quality data
The number of chemical analyses on groundwater is nearly 102 million, acquired on more than 58,600 different measuring points. Over the period 2015–2019, the average number of new quality data was almost 10 million analyses per year, varying between 7 and 16 million. The average amount of data per point exceeds 1,700. Only 0.6% of the points have a single analysis. The point with the most data is a spring in southern France, which is part of the quality monitoring network for reporting to Europe in the context of the WFD. It has over 70,800 analyses acquired between 1988 and today from more than 300 samplings. Over time, the amount of data has constantly increased. For 1980, 25,000 quality data are available in ADES. About ten years later, between 1992 and 1993, the amount of data banked per year was ten times higher (about 250,000 data). This amount continued to grow steadily from year to year, reaching 2.6 million data per year in 2008. From that year on, the amount of data grew more strongly to reach 93.3 million per year in 2018. This increase in the number of analyses is related to the increase in the number of qualitometers installed and in the number of parameters analysed. The apparent decrease in the number of analyses for 2020 is linked to the fact that there is a variable delay, depending on the producer, between the measurement being made and its banking.
All these data can be used freely for public or for scientific purposes, as done by Barbier (2009) who defined the link between the water cycle and the chemical composition of groundwater in France, using the ADES database. The water-quality data cover physical, environmental, microbiological and chemical parameters, including parameters related to the natural mineralization of water (major, minor and trace ions) as well as organic micropollutants (pesticides, pharmaceuticals, industrial pollutants, etc.). The list of parameters comes from the national parameter repository administered by the Sandre, which contains 5,550 different parameters and is constantly evolving. As the different SIE (Water Information System) partners and water data producers request new parameters, these are added to the repository. ADES regularly imports, via an export of the repository, all new parameters relevant to groundwater. In ADES, there are currently 3,880 different parameters related to groundwater. Data are available on about 2,770 different parameters and 60% of them have more than 1,000 individual data records. Around 400 parameters have more than 10,000 data.
The ten most measured parameters (between 778,000 and 434,000 data), ranked in descending order, are: pH (hydrogen potential), nitrate, ammonium, water temperature, nitrites, sulphates, chlorides, turbidity, iron and conductivity. These physico-chemical parameters are representative of general characteristics of groundwater, and only iron is an inorganic micropollutant. These are classic parameters that have been monitored in groundwater for a long time. They represent nearly 5.5 million data, or 5.6% of the bank's data.
The distribution of data by parameter class indicates that 83% of the analyses in ADES correspond to organic micropollutants with over 83.8 million data. Chemical parameters of natural origin represent 12% of the data (12.1 million data). Physical (water temperature, conductivity, etc.), environmental (odour, colour, etc.) and microbiological parameters represent 5% of the data set.
The ten most measured organic micropollutants (between 320,000 and 271,000 data), which are all pesticides, ranked in descending order, are: atrazine, simazine, atrazine desethyl, terbuthylazine, atrazine deisopropyl, diuron, isoproturon, chlortoluron, linuron and metolachlor. Despite their recent appearance and thus their shorter follow-up, they represent over 2.8 million data in ADES. The organic micropollutant with the highest number of analyses in the database is atrazine, with nearly 312,000 data, on nearly 30,500 “qualitometers”.