Category: Corpus Linguistics

2006-10-30

16:34:22, by Ingrid Tieken-Boon van Ostade
The Proceedings of the Old Bailey as a linguistic corpus

One of the suggestions I received after delivering my paper on the history of multiple negation at the Perspectives on Prescriptivism symposium in Ragusa earlier this year was that I might find some useful spoken-language data in this database. So I decided to spend some time searching the Old Bailey records. At first sight it looks like a wonderful resource, but I soon discovered a number of problems. To begin with, the database is not searchable for high-frequency words such as no or not. Neither proved less frequent, which allowed me to search for the kind of construction that I know was still regularly used in the eighteenth century. I thus found instances like "but not so drunk neither" (1727) and "but the Money was not ready then neither" (1733). One instance was particularly useful, as the speaker identified himself as a servant (this was the kind of information I was actually looking for), and it is moreover possible to identify the sex of the speaker. But other than that, I found there was little I could do with the information. The number of instances does seem to increase as the century progresses, but there is no way I can relate the absolute numbers to the amount of text. I would therefore say that the database is of limited use, other than to discover that a particular form or construction is indeed used in reported speech, by men as well as women and by people accused of having committed a crime (which does not, of course, assign them to any particular social class).

But I also found that parts of the text were scanned but not subsequently corrected, so that long <s> at times occurs as <f>, and nor as not. I have the impression that things got better the further I progressed into the eighteenth century, but it is worrying all the same.

I'd be interested in learning about other people's experiences with this database.

2005-11-29

14:13:56, by Karlijn Navest
Eighteenth-century Corpora

I hope somebody can answer the following question:

How many English (historical) corpora are there at the moment?

Quote of the month

"All the pains we bestow upon a language, when it is sufficiently perfect for all the uses of it, serve only to disfigure it, to lessen its real value, and incumber it with useless rules and refinements, which embarrass the speaker or writer."

(Joseph Priestley, A Course of Lectures on the Theory of Language and Universal Grammar. 1762.)

Witticisms and strokes of humour

A poor Fellow condemned told the late Justice Burnet it was very hard to be hang’d for stealing a Horse. “No, Friend”, said the Judge: “you are not hang’d for stealing a Horse; but that Horses may not be stolen.”

(Robert Baker, Witticisms and strokes of humour. 1766: 50)
