Understanding state preferences with text as data: Introducing the UN General Debate corpus

Research output: Contribution to journal, Article, peer-review


External organisations

  • Hertie School of Governance


Every year at the United Nations (UN), member states deliver statements during the General Debate (GD) discussing major issues in world politics. These speeches provide invaluable information on governments’ perspectives and preferences on a wide range of issues, but have largely been overlooked in the study of international politics. This paper introduces a new dataset consisting of over 7300 country statements from 1970–2014. We demonstrate how the UN GD corpus (UNGDC) can be used as a resource from which country positions on different policy dimensions can be derived using text analytic methods. The article provides applications of these estimates, demonstrating the contribution the UNGDC can make to the study of international politics.


Original language: English
Pages (from-to): 1-9
Number of pages: 9
Journal: Research and Politics
Issue number: 2
Publication status: Published - 1 Apr 2017


  • Policy preferences, foreign policy, United Nations, text as data