A common challenge for organizations in a world of rapidly changing information systems is a large amount of data that is not easily accessible. The Global Environment Facility (GEF) is no different, but we’ve taken important strides to address the challenge. We have years of Excel files, PDFs, and CDs (!) of data squirreled away here and there – invaluable historical data that are often very difficult to access and to use consistently for planning and decision-making.
Established on the eve of the Rio Earth Summit in 1992, the GEF has invested over USD 15 billion over the past 25 years in protecting the global environment. The GEF acts as a partnership, a financial mechanism for environmental conventions, an innovator and a catalyst, and its IT infrastructure was primarily established to support the management of financial information.
Through the years, necessary reporting requirements and a database have been put in place as the need to monitor and report on various aspects of the growing GEF portfolio evolved.
The data mining process
The GEF Secretariat spent a year systematically harvesting data from 100,000 data assets and compiled the information from these documents into a usable composite database. Then we used tools like Tableau, Excel, and SharePoint to help GEF staff make better use of it. Our Results Based Management team also developed new, more user-friendly ways of presenting the data; as a result, the first GEF Corporate Scorecard was published in May this year.
We’ve got a way to go, but what we learned from the exercise is that the systems and technical solutions we had in-house were, if used in a clever way, sufficient to support this large undertaking.
Modernizing our data systems brings new opportunities as well as new challenges. Having all the data available in a composite database allows many aggregate-level analyses, each requiring new technical and organizational solutions.
To analyze aggregate data, we need robust visualization options. Environmental data of this nature makes large geospatial analysis possible, again requiring new technical solutions and support. A modern data system also makes more effective reporting possible.
Some of the examples below on how we approached this exercise in GEF might spark ideas on how you can approach similar challenges in your organization.
Extracting quantitative information
Project results data are available to GEF through evaluation documents, project reviews and project information documents, and are a mix of quantitative and qualitative information.
Excel proved to be a surprisingly versatile tool for handling all the documents with quantitative information. Excel and its built-in Visual Basic macro feature were used to automate the extraction of 80% of quantitative data points from reports.
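To give a flavor of what automated extraction of this kind can look like, here is a minimal sketch. It is written in Python rather than the Visual Basic macros we actually used, and it assumes each report sheet has been exported to CSV with an indicator label in the first column and its value in the second. The indicator labels, field names and sample report are all hypothetical. Rows whose values cannot be parsed as numbers are set aside for manual review rather than guessed at.

```python
import csv
import io
import re

# Hypothetical indicator labels (as they might appear in a report)
# mapped to hypothetical database field names.
INDICATOR_FIELDS = {
    "co2 emissions avoided (tonnes)": "co2_avoided_tonnes",
    "protected area (hectares)": "protected_area_ha",
}


def parse_number(text):
    """Return a float for text like '1,250,000' or ' 42.5 ', else None."""
    cleaned = text.replace(",", "").strip()
    if re.fullmatch(r"-?\d+(\.\d+)?", cleaned):
        return float(cleaned)
    return None


def extract_indicators(csv_text):
    """Scan label/value rows of one report.

    Returns (extracted, needs_review): a dict of parsed values keyed by
    field name, and the list of known-indicator rows that could not be parsed.
    """
    extracted = {}
    needs_review = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # blank or malformed row
        field = INDICATOR_FIELDS.get(row[0].strip().lower())
        if field is None:
            continue  # not an indicator we are looking for
        value = parse_number(row[1])
        if value is None:
            needs_review.append(row)  # leave for manual extraction
        else:
            extracted[field] = value
    return extracted, needs_review


# Hypothetical report content, invented purely for illustration.
SAMPLE_REPORT = '''Project title,Sample project
CO2 emissions avoided (tonnes),"1,250,000"
Protected area (hectares),approx. 300
'''

values, review = extract_indicators(SAMPLE_REPORT)
print(values)
print(review)
```

Keeping a separate review list mirrors the split described here: most data points can be pulled automatically, while the remainder still need a person to look at them.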
Transforming qualitative data into information
Data in formats that were not conducive to automatic extraction, such as PDFs and other documents, were extracted manually. Available features in Microsoft SharePoint proved adequate to allow the qualitative content of close to 100,000 documents to be searched. This enables many thematic analyses that would not have been possible before.
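For readers curious about the general idea behind making document text searchable, the sketch below builds a small keyword index in Python. It is not how SharePoint search works internally and is not the setup we used; the document identifiers and texts are hypothetical and serve only to illustrate how a thematic query can narrow a large collection down to the relevant documents.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical documents, invented purely for illustration.
DOCUMENTS = {
    "review-001": "Mangrove restoration improved coastal resilience.",
    "review-002": "The project trained farmers in sustainable land management.",
    "evaluation-003": "Coastal communities adopted sustainable fishing practices.",
}

index = build_index(DOCUMENTS)
print(search(index, "coastal"))
print(search(index, "sustainable coastal"))
```

A query with several words returns only the documents that contain all of them, which is the simplest way to narrow a theme.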
Interpreting the data
Tableau is a user-friendly visual data exploration tool, which helps us build better hypotheses. The ease of use of a tool like Tableau has proved helpful in making data exploration, analysis and interpretation accessible for everyone.
Streamlined access to historical data enables better and more detailed reporting from GEF. One such report, the GEF Corporate Scorecard, provides a snapshot of GEF-wide efficiency and effectiveness in a reader-friendly format. The underlying analyses and graphs were all done using Excel.
Harnessing a wealth of open source spatial data
Rapid advances in geospatial and remote sensing technologies have enabled many open source data platforms and datasets. GEF is working to incorporate these into our data management system to allow robust analyses at the local, regional and global level. Existing platforms like Spatial Agent are being leveraged to support this. Geospatial information can augment and support decisions during both project inception and project implementation.
In conclusion, harvesting a data set like ours was a huge undertaking but an undoubtedly rewarding exercise. The surprising takeaway from this large data extraction exercise was that commonly available software like Excel and SharePoint can be used to take on very complex tasks with minimal training.
People involved with the Results Based Management team and the IT team as part of this work include Juman Byun, Ying Jiang, Deepak K. Kataria, Rahul Madhusudanan, Jessie Mee, Svetlana I. Negroustoueva, Omid Parhizkar, Andon Plamenov Pavlov, Naiying Peng, Christine Roehrer, Hanna Karima Schweitzer, and Sonja Sabita Teelucksingh.