Rendered below is the Colab Notebook we prepared for the Hackathon. Scroll down to see the results.
You'll find the code snippets and commands in the embedded Notebook. Here are some interesting tidbits.
py4ami
The first step in making IPCC Reports semantic is to convert the dumb PDF to HTML. py4ami does the job for us. Here's a preview of the converted HTML.
docanalysis
docanalysis automatically extracts abbreviations, their full forms, and potential Wikidata IDs, and writes them to an ami-dictionary.
<entry name="VRE" term="Variable Renewable Energy" wikidataID="['//www.wikidata.org/wiki/Q7915732']"/>
<entry name="SDGs" term="sustainable development goals" wikidataID="['//www.wikidata.org/wiki/Q7649586']"/>
<entry name="TPES" term="total primary energy supply" wikidataID="[]"/>
<entry name="TFC" term="total final energy consumption" wikidataID="[]"/>
<entry name="CSP" term="Concentrating solar power" wikidataID="[]"/>
<entry name="LIBs" term="lithium-ion batteries" wikidataID="['//www.wikidata.org/wiki/Q106988181']"/>
As an example, www.wikidata.org/wiki/Q7915732 takes you to the Wikidata page that tells you all about Variable Renewable Energy. The Wikidata page also points you to the Wikipedia page: https://en.wikipedia.org/wiki/Variable_renewable_energy.
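If you want to work with these entries outside the Notebook, here is a minimal sketch of reading them with the Python standard library. It is not part of docanalysis. The `<dictionary>` wrapper element and the `read_entries` helper are assumptions made for illustration, and the sample contains only entries copied from the output above. The `wikidataID` attribute holds the text of a Python list, so the sketch parses it with `ast.literal_eval` and prefixes `https:` to the protocol-relative links.

```python
import ast
import xml.etree.ElementTree as ET

# Entries copied from the output above, wrapped in an assumed <dictionary>
# root element so that the fragment parses as one XML document.
SAMPLE = """<dictionary title="abbreviations">
<entry name="VRE" term="Variable Renewable Energy" wikidataID="['//www.wikidata.org/wiki/Q7915732']"/>
<entry name="TPES" term="total primary energy supply" wikidataID="[]"/>
<entry name="LIBs" term="lithium-ion batteries" wikidataID="['//www.wikidata.org/wiki/Q106988181']"/>
</dictionary>"""


def read_entries(xml_text):
    """Hypothetical helper: map each abbreviation to its term and Wikidata URLs."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root.iter("entry"):
        # The attribute value is the text of a Python list, e.g. "['//www...']" or "[]".
        raw_ids = ast.literal_eval(entry.get("wikidataID", "[]"))
        # Links are protocol-relative ("//www..."), so add a scheme to make them usable.
        urls = ["https:" + u if u.startswith("//") else u for u in raw_ids]
        entries[entry.get("name")] = {"term": entry.get("term"), "wikidata": urls}
    return entries


entries = read_entries(SAMPLE)
for name, info in entries.items():
    print(name, "|", info["term"], "|", info["wikidata"] or "no Wikidata match")
```

Entries with an empty list, such as TPES, simply have no Wikidata match yet.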
Word cloud generated from the number of hits for terms in the dictionaries.
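To give a feel for what "number of hits" can mean, here is a rough sketch that counts how often each dictionary term occurs in a piece of text. This is not necessarily how the Notebook computes its counts. The `count_hits` helper and the sample sentence are made up for illustration, and matching is a simple case-insensitive whole-phrase search.

```python
import re
from collections import Counter


def count_hits(text, terms):
    """Hypothetical helper: count case-insensitive whole-phrase matches per term."""
    counts = Counter()
    for term in terms:
        # re.escape keeps characters such as "-" literal; \b avoids partial-word matches.
        pattern = r"\b" + re.escape(term) + r"\b"
        n = len(re.findall(pattern, text, flags=re.IGNORECASE))
        if n:
            counts[term] = n
    return counts


# Made-up sample text; the terms come from the dictionary entries above.
text = ("Variable renewable energy is growing. Lithium-ion batteries help "
        "integrate variable renewable energy into the grid.")
terms = ["Variable Renewable Energy", "lithium-ion batteries", "Concentrating solar power"]

hits = count_hits(text, terms)
for term, n in hits.most_common():
    print(term, n)
```

A term-to-count mapping like `hits` is the kind of input that word cloud tools typically scale their word sizes from.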
pyami, using the abbreviation dictionary and other climate-related dictionaries, can annotate the HTML version of IPCC Reports. Click here or check out the preview below.