
Climate Knowledge Graph: Project Kickoff
Climate Knowledge Graph is an R&D project that aims to make the climate change knowledge held in the IPCC Sixth Assessment Report more accessible using open science methods and FAIR data principles. At the core of the project is the creation of a knowledge graph that will support search and publishing, data analysis, and use by AI large language models (LLMs). The project aims to serve expert and citizen science communities, support multilingualism, and enable global access.
Figure: Climate Knowledge Graph schematic
The Intergovernmental Panel on Climate Change (IPCC) provides the most authoritative reports on climate change knowledge. The IPCC reports are intended to inform policy makers and governments in negotiating global collective action. The IPCC Sixth Assessment Report (AR6) has been described by UN Secretary-General António Guterres as “a survival guide for humanity. As it shows, the 1.5-degree limit is achievable. But it will take a quantum leap in climate action.”
AR6 is a large unstructured corpus with many parts residing in different locations on the web, which are not connected in a machine-readable way. The Climate Knowledge Graph (CKG) team has compiled provisional publication statistics in what would appear to be the first full quantification of the AR6 report. The report lists 1,106 authors, and the main English-language version runs to over 10,000 PDF pages comprising over 8 million words. The report is also available in the six official languages of the UN, with a small inclusion of other languages. The text is supported by 1,672 figures, 48,400 references, and 66,834 data entries.
To delve into the workings of the IPCC, and to see it as a model of a scientific organisation whose knowledge informs negotiations on global issues, two works are very helpful: the paper by Raphael Slade et al. on improving working group practice, ‘Back to Basics for the IPCC: Applying Lessons from AR6 to the Seventh Assessment Cycle’; and Hannah Hughes’ Open Access book ‘The IPCC and the Politics of Writing Climate Change’, which provides a forensic breakdown of the endeavour.
Figure: AR6 source data model
Using a knowledge graph, all of these parts can be identified, located, and connected, and their relations mapped out. The CKG project will catalogue the listed parts iteratively; the first round includes only the English-language text, authors, citations, and the glossary and acronyms.
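To make the idea concrete, the following is a minimal Python sketch of how report parts and their relations could be held as subject-predicate-object triples and then traversed. All identifiers and predicate names (`ar6:wg1-ch1`, `hasPart`, `cites`, and so on) are invented for illustration; they are not CKG's actual data model, which is being built in Wikibase.

```python
from collections import defaultdict

# Hypothetical triples: identifiers and predicate names are illustrative only,
# not taken from the CKG Wikibase.
TRIPLES = [
    ("ar6:report", "hasPart", "ar6:wg1"),
    ("ar6:wg1", "hasPart", "ar6:wg1-ch1"),
    ("ar6:wg1-ch1", "hasAuthor", "person:author-a"),
    ("ar6:wg1-ch1", "cites", "ref:example-2020"),
    ("ar6:wg1-ch1", "usesTerm", "glossary:adaptation"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> list of objects."""
    index = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        index[subject][predicate].append(obj)
    return index


def parts_of(index, node):
    """Return every node reachable from `node` via hasPart, depth-first."""
    found = []
    for child in index[node]["hasPart"]:
        found.append(child)
        found.extend(parts_of(index, child))
    return found


index = build_index(TRIPLES)
print(parts_of(index, "ar6:report"))
print(index["ar6:wg1-ch1"]["cites"])
```

Once every part has a stable identifier, questions such as "which chapters belong to this report?" or "what does this chapter cite?" become simple lookups rather than searches through PDF text.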
CKG's initial phase runs for one year from July 2025 and is funded by the Innovationsfond of TIB – Leibniz Information Centre for Science and Technology and University Library. CKG is led by Simon Worthington and is organised by two TIB labs, the Open Science Lab and Lab Knowledge Infrastructures (LKI), with two external partners: the open research group #semanticClimate and the National Institute of Plant Genome Research (NIPGR), New Delhi. The CKG project originated with #semanticClimate, which was founded six years ago and works daily as a community on developing semantic data analysis software, with NIPGR supporting an India-wide internship programme, hackathon series, and youth outreach programme.
The #semanticClimate group has laid the groundwork for CKG by providing the full AR6 corpus as HTML with IDs and by extracting the AR6 glossary as structured data. Both are available on GitHub, and the glossary has already been imported into a Wikibase. The group also provides software tooling for data mining and analysis.
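As an illustration of why HTML with IDs is a useful starting point, here is a minimal sketch using only Python's standard library that collects the text of each element carrying an `id` attribute. The sample markup and ID values are invented for this example and do not reflect the actual #semanticClimate files; the sketch also assumes well-formed markup with properly closed tags.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class IdTextCollector(HTMLParser):
    """Collect text under the nearest enclosing element that has an id."""

    def __init__(self):
        super().__init__()
        self.records = {}  # id -> list of text chunks
        self._stack = []   # one entry per open tag: its id, or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        element_id = dict(attrs).get("id")
        self._stack.append(element_id)
        if element_id is not None:
            self.records.setdefault(element_id, [])

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute the text to the innermost open element that has an id.
        for element_id in reversed(self._stack):
            if element_id is not None:
                self.records[element_id].append(data)
                break


def extract_id_text(html):
    """Return a dict mapping element id to whitespace-normalised text."""
    parser = IdTextCollector()
    parser.feed(html)
    parser.close()
    return {k: " ".join("".join(v).split()) for k, v in parser.records.items()}


# Hypothetical sample; real AR6 HTML uses its own ID scheme.
SAMPLE = """
<div id="ch1-sec1">
  <h2>Framing</h2>
  <p id="ch1-sec1-p1">First paragraph.</p>
  <p id="ch1-sec1-p2">Second<br>paragraph.</p>
</div>
"""

records = extract_id_text(SAMPLE)
for element_id, text in records.items():
    print(element_id, "->", text)
```

Because each paragraph keeps its own ID, a statement in the knowledge graph can point back to the exact passage it was derived from.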
CKG will build its initial knowledge graph for AR6 on the Wikibase platform. From there, data can be distributed to other databases, and community contributions can be accepted as needed. The roadmap for CKG starts with the data provided by #semanticClimate; Wikibase is then used for the first stages of transforming an unstructured corpus into a structured FAIR data publication corpus. This stage uses know-how and software developed at TIB OSL Wikibase Data Services as part of NFDI4Culture research, namely the services Wikibase4Research, Antelope (terminology services), and Computational Publishing Services (CPS). Alongside having the main knowledge graph in Wikibase, CKG is supported by TIB knowledge graph experts from Open Research Knowledge Graph (ORKG) and LKI.
Roadmap and tech stack
Phase | Tech Stack | Status |
Harvest | CPS, #semanticClimate | |
Structure | Wikibase, RDF, #semanticClimate | We are here (Sept. 2025) |
Annotation | #semanticClimate, Antelope, ORKG, ORKG reborn | |
Distribution | Wikidata, ORKG, ORKG reborn | |
Publishing | CPS | |
Data access & analysis | #semanticClimate, Wikibase | |
Table: CKG's methods all use open science, open source, and digitally sovereign infrastructures. The CKG workflow for structuring FAIR data can be used to structure any corpus for knowledge graph creation.
CKG welcomes partnership and collaboration, as the scale of supporting such an important and large corpus, and the potential for many uses by diverse communities, is beyond the scope of the CKG project. Even connecting the large data sets is out of scope in the first phase. Our focus is on an initial scoping and development phase and on making the report knowledge graph sustainably available. As examples of different uses, our partners LKI and #semanticClimate are already working on extended uses of the corpus. LKI has a method, called ORKG reborn, for authoring machine-reusable documents from the outset, in which the data referenced is available as FAIR data direct from the publication. #semanticClimate carries out community annotation and linking to the global knowledge graph Wikidata, which can be used for data analysis.
Climate Knowledge Graph documentation website: https://tibhannover.github.io/climate-knowledge-graph/
Contact: Simon Worthington, simon.worthington@tib.eu
References
Slade, Raphael, Minal Pathak, Sarah Connors, Melinda Tignor, Andrew Emmanuel Okem, and Noëmie Leprince-Ringuet. ‘Back to Basics for the IPCC: Applying Lessons from AR6 to the Seventh Assessment Cycle’. npj Climate Action 3, no. 1 (2024): 48. https://doi.org/10.1038/s44168-024-00130-4.
Hughes, Hannah. The IPCC and the Politics of Writing Climate Change. Cambridge: Cambridge University Press, 2024. https://doi.org/10.1017/9781009341554.
Climate Knowledge Graph – literature
Worthington, S., G. Yadav, P. Murray-Rust, R. Kumari, S. Hegde, and P. Bhadra. 2025. ‘Climate Justice in Electronic Publishing: A New Approach Supporting Global South Participation’. The Journal of Electronic Publishing 28 (2). https://doi.org/10.3998/jep.7206
Worthington, Simon, Gitanjali Yadav, Shweata Hegde, Renu Kumari, Neeraj Kumari, and Peter Murray-Rust. 2024. ‘The #SemanticClimate Community: Making Open-Source Software for Knowledge Liberation’. Annals of Library and Information Studies 71 (4): 480–95. https://doi.org/10.56042/alis.v71i4.14285
Featured image: Figure 1.5 | A climate-resilient and equitable world requires limiting global warming while achieving the Sustainable Development Goals (SDGs). Source: IPCC (2018b). https://www.ipcc.ch/report/ar6/wg3/figures/chapter-1/figure-1-5/
#LizenzCCBY40INT #climateChange #ipcc #Wikibase #knowledgeGraph