With U.S. President-elect Donald Trump naming several climate-change deniers to his cabinet, including Scott Pruitt as head of the Environmental Protection Agency and Rex Tillerson as Secretary of State, some scientists fear environmental data now available online could disappear.
University of Toronto faculty, librarians and students are working with the Internet Archive's End of Term project to help preserve at-risk U.S. government websites, with plans to capture crucial scientific and environmental sites before they vanish.
This push to preserve information before the incoming Trump administration takes control of U.S. federal agencies has spurred a full-day “Guerrilla Archiving Event: Saving Environmental Data from Trump” hackathon scheduled to take place Dec. 17 at the Faculty of Information. Participants will flag information for the non-profit organization Internet Archive, which archives Internet pages and makes them publicly available.
The hackathon is being organized by Patrick Keilty, an assistant professor in the Faculty of Information and a member of U of T’s Technoscience Salon and Research Unit; Michelle Murphy, a professor of History and Women and Gender Studies; and Matt Price, an instructor in the Faculty of Arts & Science and the Faculty of Information. Sam-chin Li is a U of T Government Publications and Reference Librarian at Robarts Library. Her team archived the Aboriginal Canada Portal site before the Canadian government shut it down.
Their work is making headlines here at home and around the world.
Writer Kathleen O'Brien spoke with them about capturing online information.
Michelle Murphy, Matt Price and Patrick Keilty are part of the Faculty of Information (photo by Kathleen O'Brien)
The Canadian government's program to print government publications ended in 2014, yet no ministry is responsible for archiving government websites. How have people at U of T become involved in preserving government websites? What can U of T teach our neighbours to the south? Why is this role important, especially at this time?
Murphy: Our event welcomes anyone with research and technical skills to join us. People with the needed research and tech skills can be found not only at U of T but across Toronto generally. We know this is important to do because of the lessons learned from what happened to environmental research and public access to data during the Harper administration here in Canada.
The Trump transition team for the U.S. Environmental Protection Agency has been explicit about its goals to dismantle programs and has, at times, taken public positions against evidence-based environmental policy, so we expect to see similar practices of making data less accessible in the U.S. Our effort is networked with colleagues planning similar events in the U.S., in Philadelphia and New York City, so we are collaborating with U.S.-based colleagues who are also taking action. Our atmosphere, waterways and lives are connected across the continent. What happens to the environment and climate change in the U.S. will affect us all.
Li: Library and Archives Canada (LAC) stopped archiving Canadian government websites around November 2007 and relaunched the program only after lobbying by the academic community in October 2013. U of T stepped in to fill the gap, archiving the Aboriginal Canada Portal and many other federal sites from December 2007 to 2015. Collaboration is important in archiving government websites because many of them are huge and need more people to dig deep for at-risk materials. We put out a call for librarians to nominate resources to archive when we learned that about 1,500 websites would eventually be reduced to one government website.
Will you help preserve online information in other vulnerable areas (religious freedoms, human rights, reproductive rights, gun statistics)?
Keilty: Our event focuses on environmental websites and data. This is a priority area because the Trump transition team has consistently identified these programs as ones to cut immediately.
We have a few goals we hope to achieve at our event. First, we need to identify vulnerable programs and then seed their URLs to the webcrawler of the End of Term project, which will make copies of those webpages. Second, we are researching and evaluating the many data repositories that the EPA has online: some of this data we know will be backed up and protected by laws, some data will be archivable at the Internet Archive through their webcrawler, and yet other sources of data will need to be identified as in need of saving at a library. Libraries, such as the one at the University of Pennsylvania, are arranging to become repositories for this kind of vulnerable data that is not easily preserved. We will be passing on what we build and research to our colleagues in other cities so that they can pick up where we have left off.
Li: Yes. University of Toronto Library has many web-archived collections, including ones on Canadian political parties and interest groups that go back more than 10 years. Other collections include federal election candidate sites from 2015, politics in Hong Kong, the Toronto mayoral election in 2014 and many others.
Which U.S. agencies do you believe are vulnerable and should be archived?
Price: Answering this question is one of the tasks of our event. It is not whole agencies that are most at risk, but programs within them. Here, it is important to research which programs and regulations the Trump transition team has said it wants to dismantle.
From there, we work back to the websites and publicly available data sources associated with those programs. However, some data is not publicly available, and colleagues in the U.S. are thinking through how to access and preserve it. Preserving publicly available data is urgent. While the new administration may not directly delete data, it takes funding, personnel and effort to maintain publicly available data. So one of the ways data may become less accessible is by starving its maintenance of resources.
Li: I think the Environmental Protection Agency is vulnerable and should be archived.
How will you make information gleaned available and accessible to the public?
Murphy: The End of Term project has a webcrawler that it is using to make copies of .gov webpages, and these pages will be stored and made publicly available on the Internet Archive website. On a technical level, our event is helping the webcrawler get to materials it might not otherwise collect and helping it make certain websites a priority for saving. If people are interested in this project and cannot make our event, they can nominate URLs to the End of Term project directly.
Li: Members of the public can access our web-archive collections.