Goals & Contributions

Main Goal

Define a set of methods, centered on VGI users and VGI data, for the analysis of farmland biodiversity indicators. In this project, farmland denotes areas dominated by agriculture, including cultivated areas, pastures, and cropland/natural vegetation mosaics.

Main Contributions

Two complementary scientific contributions, in statistical ecology and in GeoBI, and a third contribution arising from their combination, will investigate and combine the data and user aspects of VGI.

(i) Statistical tools for farmland biodiversity indicators. New statistical tools [16;15;19;40-44] will be proposed to generate meaningful farmland biodiversity indicators from opportunistic VGI data.

(ii) Participative SOLAP design methodology. This participative approach will benefit from the abundance and skills of VGI users. We will integrate participative features of Group Decision Support Systems (GDSS) [12-14;32-34] in the design phase to increase user participation and improve decision-making. The project extends existing work, which deals with a few OLAP-skilled users producing high-quality data under a standardized framework.

(iii) Participative SOLAP analysis of farmland biodiversity. Together, the two contributions above will provide, for the first time, a participative SOLAP analysis of farmland biodiversity [5;7;9] [8,6,17,39].

VGI data and users are available at the beginning of the project.

Data used

The data used in the project come from Biolovision and the Observatoire Agricole de la Biodiversité.

Biolovision, managed by LPO, is a large set of databases on species location and abundance. These databases allow anyone with an account to enter their observations (vertebrates and some entomological groups), enabling unprecedented data pooling. Biolovision holds 40 million records collected in France, a large proportion of them in farmland. Biolovision data can be classified into two categories: (i) opportunistic data and (ii) standardized data, collected either by experts or by non-specialists:

-Opportunistic data. Biolovision was primarily designed to collect opportunistic data, i.e. data collected without a protocol or sampling plan. These data describe at least the time (date and hour), the real-time geolocation, the observed taxon, and the observer.

-Standardized data. These data are gathered with a predefined protocol and sampling plan, in the context of national schemes (e.g. VIGIENATURE[2] [12p]) or local programs (e.g. Atlas). Dedicated input forms are available to observers to enter these highly standardized data.

Both data types, i.e. opportunistic and standardized, are collected using the same Web/Mobile applications (Visionature[3] and NaturaList app[4]), and are stored in different databases according to region. Biolovision exists in mainland and overseas France and in a dozen European countries[5]. In this project, the design and implementation of farmland biodiversity indicators will be based on Biolovision data.

The Observatoire Agricole de la Biodiversité (OAB) is one component of Vigie-Nature, managed by CESCO. This scheme provides standardized monitoring of farmland biodiversity, with data collected by farmers. Started in 2012, its database holds 210,000 records collected at more than 2,000 sites all over France. The OAB data will be added to the Biolovision data to implement the newly produced SOLAP analyses.

Ecological objective and contributions

To monitor biodiversity decline, policy-makers have emphasized the need for standard biodiversity indicators, and the importance and potential economic value of biodiversity monitoring for building such indicators is recognized [18]. Standardized protocols are recommended to monitor variations in species abundance [48], and citizen science, where data are collected by volunteers, is a way to mitigate the funding constraints of such monitoring. Opportunistic data, collected without protocol (for example on butterflies and birds [17]), are another potentially valuable source of information on biodiversity changes, and their number has increased greatly through online citizen science databases (e.g. Biolovision). However, numerical data (i.e. abundances) raise data quality issues, namely uncertainty resulting from sampling design and measurement methods. Such uncertainties may be of various kinds: errors in the raw data, gaps in the data, classical statistical errors, and their propagation along the information processing chain. In the case of opportunistic data, the underlying sampling design is unknown: some observers may spend a short time in the field, whereas others will stay all day; some will record all species, whereas others will note only the most spectacular; some will stay on the same spot, whereas others will cover a wider area. The value of these data for assessing biodiversity trends is thus questioned for several reasons: geographical biases, lack of standardization of sampling effort, reporting biases, and detection biases.

However, it is possible to retrieve and quantify observation effort by modeling the observers’ distribution, the detection rate for a given effort, the error rate, or the reporting rate of null observations. These issues will be addressed through three approaches: i) modeling what can be derived from the data itself; ii) modeling the observation process from a proxy (e.g. modeling birdwatching effort from the number of bird club memberships); iii) using high-quality observation data to statistically calibrate data of low or unknown observation quality, so that more information is obtained from the two kinds of datasets together than from the high-quality data alone. We will identify the limits of the spatial and temporal grain beyond which data cannot be extrapolated: for instance, models may not be projectable because of a lack of precision (in the same way that weather forecasts have temporal and spatial limits). This knowledge of uncertainties and their propagation will later be used for down- and up-scaling of biodiversity distribution and dynamics. The high-quality data used for calibration will be bird data produced by Vigie-Nature monitoring schemes, coordinated by the Muséum national d’Histoire naturelle, and stored in the Biolovision databases. The Vigie-Nature Breeding Bird Monitoring Scheme (STOC-EPS) has more than 25 years of standardized monitoring, with 30% of sites in farmland.
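As a purely illustrative sketch of how approaches ii) and iii) can fit together, the Python fragment below calibrates opportunistic counts of a single species against standardized densities. It is not the statistical method the project will develop. It assumes a deliberately simple model in which the expected opportunistic count at site j equals q * proxy_j * density_j, where proxy_j is an effort proxy (for instance bird club memberships) and q is one unknown scaling factor shared by all sites. The function names, site labels and numbers are hypothetical.

```python
from typing import Dict


def calibrate_scaling(opportunistic: Dict[str, int],
                      effort_proxy: Dict[str, float],
                      standardized_density: Dict[str, float]) -> float:
    """Estimate q in the assumed model E[count_j] = q * proxy_j * density_j.

    Only the sites listed in `standardized_density` (the calibration sites)
    are used. Under a Poisson assumption, this ratio of sums is the maximum
    likelihood estimate of q.
    """
    total_count = sum(opportunistic[s] for s in standardized_density)
    total_expected = sum(effort_proxy[s] * d
                         for s, d in standardized_density.items())
    if total_expected <= 0:
        raise ValueError("calibration sites carry no information on q")
    return total_count / total_expected


def predict_density(opportunistic: Dict[str, int],
                    effort_proxy: Dict[str, float],
                    q: float) -> Dict[str, float]:
    """Rescale opportunistic counts into density estimates at every site.

    Assumes q > 0 and a strictly positive effort proxy at each site.
    """
    return {s: opportunistic[s] / (q * effort_proxy[s]) for s in opportunistic}


# Hypothetical data: sites A and B have standardized monitoring, site C does not.
counts = {"A": 40, "B": 40, "C": 30}
proxy = {"A": 10.0, "B": 5.0, "C": 5.0}
reference = {"A": 2.0, "B": 4.0}

q_hat = calibrate_scaling(counts, proxy, reference)
densities = predict_density(counts, proxy, q_hat)
```

The single shared factor q is the strongest assumption here: it ignores species-specific detection and reporting biases, which is exactly what the richer models cited above are meant to handle.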

GeoBI objective and contributions

SOLAP systems are based on well-integrated and structured data whose suppliers are known, as well as on a suitable geovisualization paradigm [30]. Ecological Spatial Data Warehouse (SDW) models are usually complex [7;9]. Therefore, an effective design methodology is needed to correctly translate stakeholders’ needs into corresponding SDW models. Several design methodologies for (S)OLAP have been proposed in the literature [45]. These methodologies assume that stakeholders have common and precisely identified analysis needs to be transformed into SDW models. However, as already shown in the TSCF’s EDEN project [5p], when stakeholders have different backgrounds (e.g. scientists vs. citizens), the design phase is complex and time-consuming because: i) stakeholders can have various analysis needs (e.g. the spatial scale may be the department or the region); ii) different groups of stakeholders may define their needs using SDW models that are only slightly different (e.g. same spatial dimension but different spatial scales); iii) it may be difficult to work simultaneously on a project when organizations are in different geographical locations. Therefore, using the existing methodologies implies defining an SDW for each analysis need and then merging these models. This process is usually not trivial, since the merged model does not necessarily represent a consensus agreed upon by all stakeholders. To address this issue, we propose an innovative SDW design methodology based on participative Group Decision Support Systems (GDSS). GDSS are designed to support a group engaged in a collective decision process [34]. They are a widely used collaborative technology that has been shown to increase user participation and the quality of decision-making [14], and they are intended to provide computational support to participative decision-making processes. The GRUS system [13], developed at IRIT, offers the basic services commonly available in GDSS and collaborative systems.
Participative work allows users to exchange, produce, share, and modify information and knowledge without physical or temporal barriers. Such methodologies are used in several domains, such as workflows, user interfaces, and databases [51] [39], but not in the SOLAP context.

Therefore, the main objective is to define a new generic and participative SOLAP design methodology. Our proposal should handle the versioning of functional requirements models, agreement policies, and the aggregation of functional requirements models in order to facilitate the participative design of SOLAP models. The proposed methodology is participative since it allows groups of users to design their SOLAP models together. It is generic in that it could be applied in other contexts, such as environmental or health issues.
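To make the three ingredients named above more concrete (versioning, agreement policies, aggregation), here is a minimal Python sketch. It is an illustration under our own assumptions, not the methodology to be designed: a requirements model is reduced to a set of (kind, name) elements, versioning to an integer per stakeholder group, and an agreement policy to a support threshold. All class, function and element names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, List, Tuple

Element = Tuple[str, str]  # (kind, name), e.g. ("spatial_level", "region")


@dataclass(frozen=True)
class RequirementsModel:
    group: str                    # stakeholder group that proposed the model
    version: int                  # incremented each time the group revises it
    elements: FrozenSet[Element]  # measures, dimension levels, etc.


def latest_versions(models: Iterable[RequirementsModel]) -> List[RequirementsModel]:
    """Keep only the most recent version proposed by each group."""
    latest: Dict[str, RequirementsModel] = {}
    for m in models:
        if m.group not in latest or m.version > latest[m.group].version:
            latest[m.group] = m
    return list(latest.values())


def aggregate(models: Iterable[RequirementsModel],
              min_agreement: float) -> FrozenSet[Element]:
    """Keep the elements supported by at least `min_agreement` of the groups.

    min_agreement=1.0 expresses a unanimity policy; lower values express
    threshold policies such as a qualified majority.
    """
    current = latest_versions(models)
    if not current:
        return frozenset()
    votes = Counter(e for m in current for e in m.elements)
    needed = min_agreement * len(current)
    return frozenset(e for e, n in votes.items() if n >= needed)
```

With three groups, for example, a threshold of 0.6 keeps any element proposed by at least two of them, whereas 1.0 keeps only what all three share. Elements that fall below the threshold (such as competing spatial scales) are precisely those that a GDSS-supported discussion would need to resolve.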

Moreover: i) When the implementation of SDW models is supported by an automatic implementation methodology, project duration and costs decrease drastically [1]. Automatic implementation is thus a mandatory requirement for the design of spatio-multidimensional models with several users, since the more users are involved, the higher the design and implementation costs. ii) An effective SOLAP analysis needs geovisualization methods [46]. For this reason, building on [11][4], we will define a more effective methodology for the rapid design and implementation of SDW conceptual models and their associated geovisualization methods. This is a secondary objective of the project. In the rest of the document, "SOLAP model" denotes an SDW model together with its geovisualization.
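The idea of automatic implementation can be illustrated with a very small generator that turns a conceptual description (one fact, its measures, and dimensions with their hierarchy levels) into relational star-schema statements. This is only a sketch under our own assumptions and does not reproduce the tools of [1], [4] or [11]: the function name, the table naming scheme and the column types are hypothetical, and geometry columns, which a real SDW would need for its spatial dimension, are omitted.

```python
from typing import Dict, List


def star_schema_ddl(fact: str,
                    measures: List[str],
                    dimensions: Dict[str, List[str]]) -> List[str]:
    """Return CREATE TABLE statements for one fact table and its dimensions.

    `dimensions` maps each dimension name to its hierarchy levels, finest
    level first. Column types are placeholders chosen for this sketch.
    """
    statements = []
    for dim, levels in dimensions.items():
        cols = [f"{dim}_id INTEGER PRIMARY KEY"]
        cols += [f"{level} TEXT" for level in levels]
        statements.append(f"CREATE TABLE dim_{dim} ({', '.join(cols)});")
    # The fact table references every dimension and stores the measures.
    fact_cols = [f"{dim}_id INTEGER REFERENCES dim_{dim}({dim}_id)"
                 for dim in dimensions]
    fact_cols += [f"{m} REAL" for m in measures]
    statements.append(f"CREATE TABLE fact_{fact} ({', '.join(fact_cols)});")
    return statements


# Hypothetical conceptual model for bird observations.
for statement in star_schema_ddl(
        "observation",
        ["abundance"],
        {"time": ["day", "month", "year"],
         "location": ["site", "department", "region"],
         "taxon": ["species", "family"]}):
    print(statement)
```

Because the schema is derived mechanically from the conceptual model, each revision agreed by the stakeholders can be re-implemented at almost no cost, which is the property that matters when many users take part in the design.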

Complementarity of contributions, and the contribution arising from this complementarity

Both contributions of this project have the VGI context in common (Fig 1). The ecological component uses VGI data for the monitoring of farmland biodiversity (Fig 1: orange). The information system component deals with the integration of VGI users in the decision-making process (Fig 1: violet). As such, this project is inherently multidisciplinary: the ecological and information system approaches meet around the biodiversity application domain and SOLAP analysis (Fig 1: green), and enrich each other.

In the ecological context, our preliminary work [7;9] successfully integrated biodiversity abundance into SDWs. For that reason, the investigation of complex biodiversity indices, such as the Community Specialization Index in Small Agricultural Regions, using SOLAP technology is a promising research issue. Ecological studies, academic or applied, make wide use of GIS. This technology allows storing, querying, analyzing, and visualizing spatial data. Several differences exist between SOLAP and GIS systems. Among them: in terms of data structure, GIS can handle small amounts of very detailed, short-term data, while OLAP can handle large amounts of aggregated and historical data; and, contrary to GIS, SOLAP displays are interactive and generated online [30]. Since we have already successfully applied SOLAP in other application domains [5-8p], we believe that SOLAP could enhance the analysis of biodiversity indicators and allow users to visualize their data. The use of SOLAP is expected to overcome the limits that data structure and performance impose on GIS and biodiversity VGI databases.
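As an illustration of the kind of indicator involved, the Community Specialization Index (CSI) of an area is commonly computed as the abundance-weighted mean of the species specialization index (SSI) of the species recorded there. The Python sketch below rolls individual records up to the region level, in the spirit of a SOLAP aggregation along the spatial dimension. The record layout, the function name, and the SSI values are hypothetical, and the sketch ignores the observation biases discussed in the ecological objective.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, float]  # (region, species, abundance)


def community_specialization_index(records: Iterable[Record],
                                   ssi: Dict[str, float]) -> Dict[str, float]:
    """Roll records up to the region level and return the CSI of each region.

    CSI = sum(abundance * SSI) / sum(abundance), taken over all records of
    the region. Regions with zero total abundance are left out.
    """
    weighted = defaultdict(float)
    total = defaultdict(float)
    for region, species, abundance in records:
        weighted[region] += abundance * ssi[species]
        total[region] += abundance
    return {r: weighted[r] / total[r] for r in total if total[r] > 0}


# Hypothetical SSI values and records for two Small Agricultural Regions.
ssi_values = {"skylark": 1.0, "blackbird": 0.2}
observations = [("SAR-1", "skylark", 3.0),
                ("SAR-1", "blackbird", 1.0),
                ("SAR-2", "blackbird", 2.0)]
csi_by_region = community_specialization_index(observations, ssi_values)
```

Since CSI is a ratio of two sums, it cannot be obtained by averaging the CSI of sub-areas; a warehouse would store the two sums as additive measures and derive the ratio at query time, which is one reason such indices are a non-trivial case for SOLAP modeling.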

References

  1. Bimonte, S., Chanet, J., Capdeville, J., Tailleur, A., Luciano, M. 2014. Une étude sur l’efficacité des méthodes de conception et d’implémentation pour les Entrepôts de Données par une méthodologie « requirement-based »: Cas d’étude de la consommation d’énergie en agriculture. EDA 2014, 119-128
  2. Bimonte, S. 2014 A generic geovisualization model for spatial OLAP and its implementation in a standards-based architecture. Ingénierie des Systèmes d’Information 19(5): 97-118
  3. Bimonte, S., Boucelma, O., Machabert, O., Sellami, S. 2014. A new Spatial OLAP approach for the analysis of Volunteered Geographic Information. Computers, Environment and Urban Systems 48, 111-123
  4. Bimonte, S., Edoh-Alove, E., Nazih, H., Kang, M., Rizzi, S. 2013. ProtOLAP: rapid OLAP prototyping with on-demand data supply. DOLAP 2013, 61-66
  5. Bimonte, S. 2016 Current approaches, challenges and perspectives on Spatial OLAP for Agri-Environmental Analysis. Journal of Agricultural and Environmental Information Systems, 7 (4): 33-50
  6. Boulil, K., Le Ber, F., Bimonte, S., Grac, C., Cernesson, F. 2014. Multidimensional modeling and analysis of large and complex watercourse data: an OLAP-based solution. Ecological Informatics 24: 90-106
  7. Sautot, L., Faivre, B., Journaux, L., Molin, P. 2015. The hierarchical agglomerative clustering with Gower index: A methodology for automatic design of OLAP cube in ecological data processing context. Ecological Informatics 26(2): 217-230
  8. Bimonte, S., Pradel, M., Boffety, D., Tailleur, A., André, G. 2013. A New Sensor-Based Spatial OLAP Architecture Centered on an Agricultural Farm Energy-Use Diagnosis Tool. IJDSST 5(4): 1-20
  9. Sautot, L., Bimonte, S., Journaux, L., Faivre, B. 2015. Dimension Enrichment with Factual Data During the Design of Multidimensional Models: Application to Bird Biodiversity. ICEIS 2015, 280-299
  10. Bimonte, S., Kang, M. 2013. WikOLAP: Integration of Wiki and OLAP Systems. Encyclopedia of Business Analytics and Optimization. IGI Global
  11. Bimonte, S., Hassan, A., Beaune, P. 2016. From Design to Visualization of Spatial OLAP Applications: A First Prototyping Methodology. ER Workshops 2016, 113-123
  12. H. Ait-Haddou, G. Camilleri, P. Zaraté 2014. Prediction of Ideas Number During a Brainstorming Session. Group Decision and Negotiation, 23(2), 271-298.
  13. P. Zaraté. Tools for Collaborative Decision-Making, Wiley, 2013.
  14. Calenge, C., Chadoeuf, J., Giraud, C., Huet, S., Julliard, R., Monestiez, P., Piffady, J., Pinaud, D., Ruette, S. 2015. The Spatial Distribution of Mustelidae in France. Plos One 10.
  15. Giraud, C., Calenge, C., Coron, C., Julliard, R. 2015. Capitalizing on opportunistic data for monitoring relative abundances of species. Biometrics
  16. Julliard, R., Jiguet, F., Couvet, D. 2004. Common birds facing global changes: what makes a species at risk? Glob. Change Biol. 10, 148–154.
  17. Levrel, H., Fontaine, B., Henry, P.-Y., Jiguet, F., Julliard, R., Kerbiriou, C., Couvet, D. 2010. Balancing state and volunteer investment in biodiversity monitoring for the implementation of CBD indicators: A French example. Ecol. Econ. 69, 1580–1586.
  18. Regnier, C., Achaz, G., Lambert, A., Cowie, R.H., Bouchet, P., Fontaine, B. 2015. Mass extinction in poorly known taxa. Proc. Natl. Acad. Sci. U. S. A. 112, 7761–7766.
  19. Chateil, C, Porcher, E. 2015. Landscape features are a better correlate of wild plant pollination than agricultural practices in an intensive cropping system. Agriculture Ecosystems & Environment, 201: 51-57.
  20. Giraud, C, Julliard, R, Porcher, E. 2013. Delimiting synchronous populations from monitoring data. Environmental and Ecological Statistics, 20: 337-352.
  21. Chateil, C., Goldringer, I, Tarallo, L, Kerbiriou, C, Le Viol, I, Ponge, JF, Salmon, S, Gachet, S, Porcher, E. 2013. Crop genetic diversity benefits farmland biodiversity in cultivated fields. Agriculture Ecosystems & Environment, 171: 25-32.
  22. Villemey A, Archaux F 2015. Mosaic of grasslands and woodlands is more effective than habitat connectivity to conserve butterflies in French farmland. Biological Conservation 191: 206–215.
  23. Filippi-Codaccioni, O., Couzi, L., Hameau, P. 2013. Distribution des micromammifères en Aquitaine – Recherche des facteurs explicatifs et cartographie. Technical report
  24. Sean L. Maxwell, Richard A. Fuller, Thomas M. Brooks, & James E. M. Watson. 2016. Biodiversity: the ravages of guns, nets and bulldozers. Nature 536:143-145.
  25. Bommarco R, Kleijn D & Potts SG. 2013. Ecological intensification: harnessing ecosystem services for food security. Trends in ecology and evolution. 28: 230-238.
  26. Prince, K., Moussus, JP, Jiguet, F. 2012. Mixed effectiveness of French agri-environment schemes for nationwide farmland bird conservation. Agriculture Ecosystems & Environment, 149: 74-79
  27. Spinsanti, L., Ostermann, F. 2013. Automated geographic context analysis for volunteered information. Applied Geography 43, 36–44
  28. Altieri, M.A. 1999. The ecological role of biodiversity in agroecosystems. Agric. Ecosyst. Environ. 74, 19–31.
  29. Bédard, Y, Rivest, S., Proulx, M. 2006. Spatial on-line analytical processing (solap): Concepts, architectures, and solutions from a geomatics engineering perspective. Data Warehouses and OLAP: Concepts, Architecture, and Solutions, 298–319.
  30. Connors, J. P., Lei, S., Kelly, M. 2012. Citizen science in the age of neogeography: Utilizing volunteered geographic information for environmental monitoring. Annals of the Association of American Geographers, 102(6), 1267-1289
  31. Corr, L. Stagnitto, J., 2011. Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema. DecisionOne Press
  32. Cravero, A., Sepúlveda, S., 2014. Multidimensional design paradigms for data warehouses: a systematic mapping study. Journal of Software Engineering and Applications, 7(1), 53-63
  33. Desanctis A., Gallupe R. 1987. A Foundation for the study of group decision support systems. Management Science,33(5),589-609.
  34. Devictor, V., Julliard, R., Jiguet, F. 2008. Distribution of specialist and generalist species along spatial gradients of habitat disturbance and fragmentation. Oikos 117, 507–514.
  35. Feick, R., Roche. S. 2012. Understanding the Value of VGI. Crowdsourcing Geographic Knowledge: Volunteered Geographic Information in Theory and Practice.
  36. Goodchild, M. 2007. Citizens as sensors: the world of volunteered geography. GeoJournal, 69, 211–221.
  37. Golfarelli, M., Mantovani, M., Ravaldi, F. 2013. Lily: A Geo-Enhanced Library for Location Intelligence. DaWaK 2013, 72-83
  38. Roche, S., Mericskay, B., Batita, W., Bach, M., Rondeau, M. 2012. WikiGIS Basic Concepts: Web 2.0 for Geospatial Collaboration. Future Internet 4(1), 265-284
  39. Fithian, W., Elith, J., Hastie, T., Keith, D.A. 2015. Bias correction in species distribution models: pooling survey and collection data for multiple species. Methods Ecol. Evol. 6, 424–438.
  40. Isaac, N.J., van Strien, A.J., August, T.A., de Zeeuw, M.P., Roy, D.B. 2014. Extracting robust trends in species’ distributions from unstructured opportunistic data: a comparison of methods. bioRxiv.
  41. MacKenzie, D.I., Nichols, J.D., Royle, J.A., Pollock, K.H., Bailey, L., Hines, J.E. 2005. Occupancy Estimation and Modeling: Inferring Patterns and Dynamics of Species Occurrence.
  42. Miller, D.A., Nichols, J.D., McClintock, B.T., Grant, E.H.C., Bailey, L.L., Weir, L.A. 2011. Improving occupancy estimation when two types of observational error occur: non-detection and species misidentification. Ecology 92, 1422–1428.
  43. Pagel, J., Anderson, B.J., O’Hara, R.B., Cramer, W., Fox, R., Jeltsch, F., Roy, D.B., Thomas, C.D., Schurr, F.M. 2014. Quantifying range-wide variation in population trends from local abundance surveys and widespread opportunistic occurrence records. Methods Ecol. Evol. 5, 751–760.
  44. Romero, O., Abelló, A. 2009. A Survey of Multidimensional Modeling Methodologies. IJDWM, 5(2), 1-23.
  45. MacEachren, A., Gahegan, M., Pike, W.: Geovisualization for knowledge construction and decision support. IEEE Comput. Graph. Appl. 24(1), 13–17 (2004)
  46. N. Stefanovic, J. Han, and K. Koperski. Object-based selective materialization for efficient implementation of spatial data cubes. IEEE Trans. on Knowl. and Data Eng., 12(6):938–958, 2000.
  47. Brisson N, Gate P, Gouache D, Charmet G, Oury F-X et al. (2010) Why are wheat yields stagnating in Europe? A comprehensive data analysis for France. Field Crops Res 119 (1):201-212
  48. Williams, P.H., Margules, C.R., Hilbert, D.W. 2002. Journal of Biosciences 27(4), Suppl. 2, 327-338
  49. Di Tria, F., Lefons, E., Tangorra, F. 2014. Design process for Big Data Warehouses. DSAA 2014, 512-518
  50. Adla, P. Zaraté, J. Soubie. 2011. A Proposal of Toolkit for GDSS Facilitators. Group Decision and Negotiation, 20(1), 57-77
  51. Wang, L., Shen, W., Xie, H., Neelamkavil, J., Pardasani, A. 2002. Collaborative conceptual design – state of the art and future trends. Computer-Aided Design 34(13), 981-996
  52. Microsoft SOLAP. https://www.youtube.com/watch?v=4xJMuJE28oA. Visited on 7/3/2017
  53. Grima, C. Rios Insua D.: A Generic System for Remote e-Voting Management. In Rios Insua, D. and French, S. (Eds) Advances in Group Decision and Negotiation 5 – e-Democracy. Springer. Pp 223-240 (2010)
  54. Alfaro, C., Gomez, J., Rios, J.: From participatory to e-Participatory Budgets. In Rios Insua, D. and French, S. (Eds) Advances in Group Decision and Negotiation 5 – e-Democracy. Springer. Pp 283-300 (2010)
  55. Smoliar, S., Sprague, R.: Communication and Understanding for Decision Support. Proceedings of the International Conference IFIP TC8/WG8.3, Cork, Ireland, pp. 107-119 (2002).
  56. Chen, L., Soliman, K.S., Mao, E., Frolick, M.N. 2000. Measuring user satisfaction with data warehouses: an exploratory study. Information & Management 37(3), 103–110.

1p. MESOLAP – University of Alicante; TIN2008-03863 – University of Barcelona; Tropos project; H2020 Agile Analytics on Big Data Cubes

2p. ANRT Cifre - IRIT with Capgemini; Rhone Alpes Region’s ARC 5 project; ANR CAIR; ANRT Cifre with EDF and IENSAM; ANR Aggreg

3p. ANR LEDEPAGOD; ANR FITOTIC; ANR ECOPACK

4p. ANR HEADWORK; CNRS Mastodons project; GDR Ecostats project

5p. CASDAR EDEN

6p. Carnot VGOLAP

7p. ANR Fresqueau

8p. European program Life Plan ‘Loire grandeur nature’ STORI

9p. ANR CAPTIVEN

10p. ANR “65 millions d’observateurs”

11p. H2020 RUCAPS

12p. National program Vigienature