The proceedings of Australia's Commonwealth Parliament are recorded in Hansard, which is available online through the Parliamentary Library's ParlInfo database. This repository includes Jupyter notebooks to harvest and explore XML-formatted versions of Hansard.
You can access a full harvest of the XML files for both houses between 1901 and 1980 from this repository.
The XML files are made available on the Australian Parliament website under a CC-BY-NC-ND licence.
Tools, tips, and examples
Results in ParlInfo are generated from well-structured XML files that can be downloaded individually from the web interface – one XML file for each sitting day. This notebook shows you how to download all the XML files for large-scale analysis. It's an updated version of the code I used to harvest Hansard in 2016.
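To give a rough idea of the 'one XML file for each sitting day' pattern, here's a minimal sketch of a download loop. It's not the notebook's actual code. It assumes you've already built a dictionary mapping sitting dates to XML download links. The function names, the `fetch` and `delay` parameters, and the file naming scheme are all hypothetical, and finding the links in ParlInfo is not covered here.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Dict, List


def fetch_url(url: str) -> bytes:
    """Download the raw bytes at a url (hypothetical default fetcher)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def save_sitting_days(
    xml_urls: Dict[str, str],
    output_dir: str,
    fetch: Callable[[str], bytes] = fetch_url,
    delay: float = 0.0,
) -> List[Path]:
    """
    Save one XML file per sitting day.

    xml_urls maps an ISO date string (assumed format: YYYY-MM-DD) to the
    download link for that day's XML file. Files already on disk are
    skipped, so an interrupted harvest can be restarted.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for date, url in sorted(xml_urls.items()):
        path = out / f"{date}.xml"
        if path.exists():
            continue
        content = fetch(url)
        # Raises ParseError if the download isn't well-formed XML
        ET.fromstring(content)
        path.write_bytes(content)
        saved.append(path)
        # Pause between requests to go easy on the server
        time.sleep(delay)
    return saved
```

You'd call it with something like `save_sitting_days(links, "hansard-xml", delay=1.0)`, where `links` is your own date-to-link dictionary. Passing in the `fetch` function makes it easy to swap in a different HTTP client, or a fake one for testing.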