
Press releases relating to COVID

Trove includes more than 380,000 press releases, speeches, and interview transcripts issued by Australian federal politicians and saved by the Parliamentary Library. This dataset contains metadata and full text of items from the press releases collection that include the term 'covid' or 'coronavirus'.

The dataset was created in two stages. First, Trove was searched for relevant resources, and metadata from each search result was saved in a CSV file. Then links to ParlInfo contained within the metadata were used to download the full text.

Known issues:

  • some items had no downloadable text available in ParlInfo, so the number of text files (5,520) is less than the number of metadata records (5,523)
  • some items have different metadata but the same text, for example when a press release is issued by both a political party and an individual member
  • because of the way Trove groups records, it's possible some of the records don't include any of the search terms.
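
The last issue can be checked against the downloaded text. The following is a minimal sketch, not part of the original harvest: it assumes a plain case-insensitive substring match is an acceptable stand-in for Trove's own search matching (which may behave differently), and the function name `mentions_search_terms` is hypothetical.

```python
import re

# Assumption: a case-insensitive substring match approximates the search;
# Trove's own matching rules may differ.
SEARCH_TERMS = re.compile(r"covid|coronavirus", re.IGNORECASE)


def mentions_search_terms(text: str) -> bool:
    """Return True if the text contains 'covid' or 'coronavirus'."""
    return bool(SEARCH_TERMS.search(text))
```

Applying this function to the contents of each file in the text directory would give a rough list of items that may have been included only because of the way Trove groups records.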

Download complete dataset as zip
Explore in Datasette

Files

text

date harvested: 2024-02-25
format: directory
number of files: 5,520
license: Copyright Not Evaluated

Contains individual text files, each containing the contents of an item from the press releases collection. Files are named using the following metadata fields: date, contributor, and version_id. For example: 1960-01-22-casey-richard-213589278.txt.
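
The naming pattern can be reversed to recover the metadata fields from a file name. This is a sketch under two assumptions that the description above does not confirm: that the date is always a full 'YYYY-MM-DD' value, and that the version_id is the final hyphen-separated part. The function name `parse_text_filename` is hypothetical.

```python
from pathlib import Path


def parse_text_filename(filename: str) -> dict:
    """Split a text file name into date, contributor, and version_id."""
    stem = Path(filename).stem           # drop the '.txt' extension
    date = stem[:10]                     # assumed to be 'YYYY-MM-DD'
    rest = stem[11:]                     # skip the hyphen after the date
    # Contributor slugs contain hyphens, so split on the last hyphen only.
    contributor, _, version_id = rest.rpartition("-")
    return {
        "date": date,
        "contributor": contributor,
        "version_id": int(version_id),
    }
```

For the example file name above, this returns the date '1960-01-22', the contributor 'casey-richard', and the version_id 213589278. The version_id can then be matched against the version_id column in results.csv.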

Download from GitHub

results.csv

date harvested: 2024-02-25
format: text/csv
file size: 5.2 MB
number of rows: 5,523
license: CC0 Public Domain Dedication

Download from GitHub

Columns

name type description
title string title of the press release, interview, or speech
contributor string names of people and organisations contributing to this item; multiple values separated by
date date date when this item was presented or published; ISO format 'YYYY-MM-DD'
description string usually includes the beginning of the full text content (truncated at approx 200 characters)
type string type of resource, eg 'Press Release' or 'Speech'; multiple values separated by
format string format of resource, usually either 'Online Text' or empty; multiple values separated by
work_type string Trove work format, eg 'Article'; multiple values separated by
language string language of resource, eg 'eng'; multiple values separated by
extent string often includes the number of pages; multiple values separated by
rights string information about copyright; multiple values separated by
subject string subject or topic headings; multiple values separated by
is_part_of string collection containing this item; multiple values separated by
fulltext_url string link to the full text version in ParlInfo
trove_url string link to the version record in Trove
work_id integer Trove work identifier
version_id integer Trove version identifier
hash string SHA-1 hash value generated from the full text; useful for identifying duplicate texts
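
As the hash description suggests, records that share a hash value share the same text. The sketch below groups version_id values by hash using only the column names listed above. It assumes that rows without downloadable text have an empty hash value, which is not stated in the table, and the function name `group_by_hash` is hypothetical.

```python
import csv
from collections import defaultdict


def group_by_hash(csv_file) -> dict:
    """Map each hash shared by two or more rows to their version_id values."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Assumption: an empty hash means no full text was downloaded.
        if row["hash"]:
            groups[row["hash"]].append(row["version_id"])
    # Keep only hashes that occur more than once, i.e. duplicate texts.
    return {h: ids for h, ids in groups.items() if len(ids) > 1}
```

To use it, open results.csv with `open("results.csv", newline="", encoding="utf-8")` and pass the file object to `group_by_hash`. Each entry in the result lists the version records that point to one shared text, for example a press release issued by both a party and an individual member.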

Context of creation

date harvested: 2024-02-25
notebook: Harvest parliament press releases from Trove
query: nuc:"APAR:PR" AND (covid OR coronavirus)

Getting help