This is a summary of the projects that semanticClimate is working on.
It gives an idea of how to get started with any climate report that is published as a PDF and is neither machine-readable nor in semantic form.
Use html_with_ids.html for all chapters across the different working groups.
Here are the steps to create a dictionary from a list of words.
pip install amilib
amilib DICT --words list_of_words.txt --description wikipedia --dict output_dict.html --figures --operation create
This creates an HTML dictionary in which each term is enriched with information from Wikipedia and with figures related to the term.
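The same step can also be driven from Python, for example inside a notebook cell. The following is a minimal sketch, not part of amilib: the helper names (write_wordlist, build_dict_command, create_dictionary) are hypothetical, and the one-term-per-line layout of the word list file is an assumption. The command itself is copied unchanged from the one shown above.

```python
import shutil
import subprocess
from pathlib import Path


def write_wordlist(words, path="list_of_words.txt"):
    """Write unique, non-empty terms to a text file.

    Assumption: one term per line is the layout the tool expects.
    """
    seen = set()
    cleaned = []
    for word in words:
        word = word.strip()
        # Skip blanks and case-insensitive duplicates, keep first spelling.
        if word and word.lower() not in seen:
            seen.add(word.lower())
            cleaned.append(word)
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


def build_dict_command(words_path="list_of_words.txt", dict_path="output_dict.html"):
    """Return the amilib command from this README as an argument list."""
    return [
        "amilib", "DICT",
        "--words", str(words_path),
        "--description", "wikipedia",
        "--dict", str(dict_path),
        "--figures",
        "--operation", "create",
    ]


def create_dictionary(words, words_path="list_of_words.txt", dict_path="output_dict.html"):
    """Write the word list, then run amilib if it is installed."""
    write_wordlist(words, words_path)
    cmd = build_dict_command(words_path, dict_path)
    if shutil.which("amilib") is None:
        # amilib is not on PATH: return the command without running it.
        return cmd, None
    result = subprocess.run(cmd, check=False)
    return cmd, result.returncode
```

Calling create_dictionary(["albedo", "permafrost"]) writes list_of_words.txt and, if amilib is installed, runs the command. Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.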
In the Colab notebook, the word list is uploaded as a txt file from the local system, and the dictionary is then created by running the code in the notebook's code cell. The notebook gives all the steps, which makes the amilib tool easy to use.
The IPCC chapters contain many terms that are difficult to understand while reading the report. A dictionary has therefore been created for those terms, enriched with Wikipedia information along with figures. This dictionary has been used to annotate the chapters. An annotated report is easier for any group of readers to understand.
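To illustrate what annotation means here, the following is a minimal sketch that links dictionary terms found in a plain-text paragraph to entries in the dictionary file. It is not amilib's annotation code. The function names and the anchor scheme (lowercased term with spaces replaced by underscores) are assumptions made only for this example, and the ids in a real dictionary may differ.

```python
import html
import re


def term_anchor(term):
    """Hypothetical id scheme: lowercase, whitespace runs become underscores."""
    return re.sub(r"\s+", "_", term.strip().lower())


def annotate_text(text, terms, dict_href="output_dict.html"):
    """Return HTML in which each dictionary term links to its entry."""
    # Longest terms first so "greenhouse gas" wins over "greenhouse".
    ordered = sorted({t.strip() for t in terms if t.strip()}, key=len, reverse=True)
    if not ordered:
        return html.escape(text)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches.
        parts.append(html.escape(text[last:match.start()]))
        href = f"{dict_href}#{term_anchor(match.group(1))}"
        parts.append(
            f'<a href="{html.escape(href)}">{html.escape(match.group(1))}</a>'
        )
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

For example, annotate_text("Albedo affects greenhouse gas forcing.", ["albedo", "greenhouse gas"]) returns the sentence with "Albedo" and "greenhouse gas" wrapped in links to output_dict.html. Matching is case-insensitive and on whole words only, so a term inside a longer word is left alone.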
The steps are demonstrated in the Colab notebook.