Congressional Record for the 43rd-114th Congresses: Parsed Speeches and Phrase Counts
This dataset contains processed text from the bound and daily editions of the United States Congressional Record, as provided by HeinOnline. The bound edition covers the 43rd through the 111th Congresses, and the daily edition covers the 97th through the 114th. Each edition includes all the text spoken on the floor of each chamber of Congress: the United States House of Representatives and the United States Senate. An automated script parses the text of each session to produce full-text speeches, metadata on the speeches and their speakers, and counts of two-word phrases (bigrams) by speaker and participant. Text is aggregated across sessions to flag bigrams that relate to congressional procedure or that are extremely common or rare. Also included are the results of a manual audit of the script and statistics on the rate at which speeches were matched to members of Congress.
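As a rough illustration of what a per-speaker bigram count looks like, the following Python sketch tallies adjacent word pairs for each speaker. It is not the script used to build this dataset: the tokenization rule (lowercased runs of letters), the function names, and the sample speeches are all assumptions made for the example, and any stemming or stop-word handling the actual pipeline may apply is not reproduced here.

```python
from collections import Counter, defaultdict
import re


def bigrams(text):
    """Return adjacent word pairs from text (hypothetical tokenization:
    lowercase, letters only; not the dataset's actual rule)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(zip(tokens, tokens[1:]))


def count_bigrams_by_speaker(speeches):
    """Aggregate bigram counts per speaker.

    speeches: iterable of (speaker, text) pairs (illustrative format).
    """
    counts = defaultdict(Counter)
    for speaker, text in speeches:
        counts[speaker].update(bigrams(text))
    return counts


# Made-up sample speeches, for illustration only.
speeches = [
    ("Speaker A", "I yield the floor to the gentleman"),
    ("Speaker B", "The gentleman yields the floor"),
    ("Speaker A", "I yield the floor"),
]

counts = count_bigrams_by_speaker(speeches)
for speaker, counter in counts.items():
    print(speaker, counter.most_common(2))
```

Counts like these could then be summed across speakers or sessions to spot phrases that are unusually common or rare, which is the kind of flagging the description mentions.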
Organization
Stanford University
Temporal coverage
2000 - 2017
® 2025 Data Basis