Corpus of Founding Era American English (COFEA)

Current Status:  

Version 3.00 was built 4 February 2019.

It includes corrections of OCR errors and adjusted word counts.

Current sources include 119,801 texts from six sources for a total of 133,488,113 words.

Source Documents Words
Evans Early American Imprints 2,645 62,660,171
Founders Online 115,408 37,057,114
HeinOnline 277 32,237,273
Farrands 847 689,755
United States Statutes at Large 479 470,345
Elliots 145 373,455
Totals 119,801 133,488,113
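The totals row can be cross-checked against the per-source figures. The following is a minimal sketch, not part of the COFEA tooling: the dictionary name `sources` and the variable names are illustrative assumptions, and the numbers are copied from the Version 3.00 table.

```python
# Per-source (documents, words) counts copied from the Version 3.00 table.
# The names used here are illustrative only, not part of any COFEA tool.
sources = {
    "Evans Early American Imprints": (2_645, 62_660_171),
    "Founders Online": (115_408, 37_057_114),
    "HeinOnline": (277, 32_237_273),
    "Farrands": (847, 689_755),
    "United States Statutes at Large": (479, 470_345),
    "Elliots": (145, 373_455),
}

# Sum each column across all sources.
total_documents = sum(docs for docs, _ in sources.values())
total_words = sum(words for _, words in sources.values())

print(f"{total_documents:,} documents, {total_words:,} words")
# prints: 119,801 documents, 133,488,113 words
```

Both sums agree with the stated totals of 119,801 texts and 133,488,113 words.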

 

Version 2.1

This version included 95,133 texts from three sources for a total of 138,892,619 words.

The initial three sources are:

Founders Online (https://founders.archives.gov/): over 90,000 records (mostly personal records, letters, diaries, etc.) from the National Archives.

Broken down by word count, the Founders Online records we are using represent the following founders.

Author Words
Washington Papers 12,044,694
Adams Papers 7,274,489
Hamilton Papers 3,895,699
Franklin Papers 2,578,518
Jefferson Papers 1,726,603
Madison Papers 119,680

HeinOnline (the largest legal publisher in the United States): around 300 records, mostly session laws, executive department reports, and legal treatises. For the most recent title list click here.

Evans Bibliography of Early American Imprints, covering the time frame of 1760 to 1799: around 3,000 texts from Evans's work American bibliography: a chronological dictionary of all books, pamphlets and periodical publications printed in the United States of America from the genesis of printing in 1639 down to and including the year 1820; with bibliographical and biographical notes. We were given a third of the available Evans texts, and about half of those fell within our time frame. They were shared with us by the University of Michigan's Text Creation Partnership (TCP). For the most recent title list click here.

Goal: Develop a large, balanced corpus of English-language materials from between 1760 and 1799.

Background:

COFEA was initially conceptualized by James Phillips in 2015, while he was a visiting professor at BYU Law School.

It covers the time period starting with the reign of King George III and ending with the death of George Washington (1760-1799), making it the oldest historical corpus of American English, and possibly the first in existence for that time period.

 


David Armond

Head of Infrastructure & Technology

BYU Law
Sara White

Corpus Linguistics Fellow
