US Library of Congress to stop collecting each and every tweet

27 Dec 2017

For the last several years, the Library of Congress has archived nearly every public tweet ever posted. That practice will soon end: the library announced Tuesday that it would stop its blanket collection of tweets at the beginning of the new year.

In 2010, when Twitter was not as omnipresent as it is now, the Library of Congress and Twitter agreed to create an archive of all publicly available tweets to capture "the emergence of online social media".

In a press release, the library said Tuesday that "the nature of Twitter has changed", citing Twitter's new 280-character limit, an increased frequency of tweeting, and the rise of non-text tweets. It concluded that nearly 12 years of tweets - from Twitter's inception in 2006 until the end of 2017 - are more than enough for future scholars to pore through.

The library is now only interested in tweets with "event-based" merit or tweets related to "themes of ongoing national interest". Tweets about your annoying little brother or what you had for lunch are thus no longer fit for the nation's digital archives, as The Week puts it.

Moreover, the Library of Congress only takes text-based tweets, so posts like President Trump's recent retweet of a bloodied CNN logo splattered on his shoe don't make the cut.

When it started, the very notion that the research arm of the United States Congress - which also happens to be the country's oldest federal cultural institution and the largest library in the world - had an interest in the public prattle of people publishing thoughts and links in real time, 140 characters at a time, seemed far-fetched. But the deal was done. With help from Twitter itself, the institution acquired all public tweet text (including tweets by countless members of Congress and several US presidents) published between 2006 and 2010, along with a promise to do the same in the years to come.

Twitter has since exploded in size (the company went public in November 2013), and the volume of tweet text flooding into the library's digital archives grew exponentially, eventually becoming too much.

Beginning 1 January, the Library of Congress will "acquire tweets on a selective basis". It is the same policy the institution applies to websites, which have naturally exploded in number since they first appeared decades ago.

"The Library [chose to collect tweets] for the same reason it collects other materials - to acquire and preserve a record of knowledge and creativity for Congress and the American people. The initiative was bold and celebrated among research communities," wrote Library of Congress spokeswoman Gayle Osterberg on the library's website. "In the years since, the social media landscape has changed significantly, with new platforms, an explosion in use, terms of service and functionality shifting frequently and lessons learned about privacy and other concerns. The Library now has a secure collection of tweet text, documenting the first 12 years (2006-2017) of this dynamic communications channel - its emergence, its applications and its evolution."

The Library says it plans to continue to preserve and secure its collection of tweet text. The collection as a whole will remain under embargo "until access issues can be resolved in a cost-effective and sustainable manner".

"The Twitter Archive may prove to be one of this generation's most significant legacies to future generations," the institution wrote in a white paper announcing the policy change. "Future generations will learn much about this rich period in our history, the information flows, and social and political forces that help define the current generation."