Introduction
Synopsis
Originally conceived as a complex digital object comprising audio clips from field dialect recordings coordinated with text files containing analysis on several levels, the Bulgarian Dialectology as Living Tradition project is now being prepared as an interactive database. The central focus remains the collection of interviews, each of which is available both as an audio file and in text format. The texts are currently being transcribed, translated, annotated, and entered into the database. When data entry is completed, tags at both the token level (for linguistic features) and the text level (for discourse and content features) will allow users to extract and compare data on many more levels than has previously been possible.
Structure of the project
The database comprises 150 audio clips, excerpted from recordings made by the authors in 60 different Bulgarian villages over a span of 15 years. All recordings are of speech in natural conversational settings, and each excerpt is chosen with both form and content in mind. With respect to form, the excerpt illustrates the most salient linguistic features of a particular dialect; with respect to content, it is a well-formed piece of discourse communicating some aspect of village life or of the speaker’s worldview. Wherever possible, excerpts have also been chosen to illustrate methods of field dialectology. Data entry for each excerpt consists of:
- a line-by-line transcription of the text with interlinear translation (“Line view”)
- the same, but with grammatical glosses under each word or token (“Token view”)
- at the token level, identification of the meaning, basic grammatical attributes, and salient linguistic traits
- at the text level, identification of thematic content (by topic), discourse elements, and narrative style or genre
- identification of the gender and age of informants, and the date and location of recording
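The data-entry components listed above can be pictured as a small nested record: an excerpt holds lines, and each line holds tokens. What follows is only a minimal sketch under stated assumptions, not the project's actual schema. All class names, field names, and sample values (including the village name and the token) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    form: str      # dialect form as transcribed
    gloss: str     # meaning plus basic grammatical attributes
    lexeme: str    # corresponding standard Bulgarian lexeme
    traits: list[str] = field(default_factory=list)  # salient linguistic traits


@dataclass
class Line:
    tokens: list[Token]
    translation: str  # interlinear translation of the whole line

    def transcription(self) -> str:
        # The line's transcription is its token forms joined in order.
        return " ".join(t.form for t in self.tokens)


@dataclass
class Excerpt:
    village: str            # location of recording
    recorded: str           # date of recording
    informant_gender: str
    informant_age: int
    topics: list[str]       # text-level thematic tags
    lines: list[Line]

    def line_view(self) -> list[tuple[str, str]]:
        # "Line view": each transcribed line paired with its translation.
        return [(ln.transcription(), ln.translation) for ln in self.lines]

    def token_view(self) -> list[list[tuple[str, str]]]:
        # "Token view": each token paired with its grammatical gloss.
        return [[(t.form, t.gloss) for t in ln.tokens] for ln in self.lines]


# Illustrative values only; not taken from the project's data.
excerpt = Excerpt(
    village="Village A",
    recorded="(date)",
    informant_gender="f",
    informant_age=70,
    topics=["food"],
    lines=[
        Line(
            tokens=[Token("mleko", "milk.N.SG", "mljako", ["e-reflex of jat"])],
            translation="milk",
        )
    ],
)

print(excerpt.line_view())
print(excerpt.token_view())
```

Storing tags on the token and on the excerpt separately mirrors the token-level and text-level split described in the list; both views are then derived from the same underlying records rather than entered twice.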
Database advantages
- all locations represented are immediately visible as pins on a Google map
- the database can be browsed both for the selected linguistic traits and grammatical attributes of individual tokens and for the thematic content and discourse elements of individual texts
- all attested representations of any one standard Bulgarian lexeme can be seen at a glance
- the location within a text of any one browsed element or trait can be precisely identified, and the user can then directly access either the text or the audio file in question
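As a rough illustration of the last two points, one way to list every attested form of a standard lexeme together with its exact position is an index keyed by lexeme. This is a hedged sketch only: the tuple layout, the text identifiers, the function names, and the sample rows are all assumptions made for the example and do not describe how the project's database is actually implemented.

```python
from collections import defaultdict

# Hypothetical token rows:
# (text_id, line_no, token_no, dialect_form, standard_lexeme)
TOKENS = [
    ("text-01", 1, 1, "mleko", "mljako"),
    ("text-01", 2, 3, "mleko", "mljako"),
    ("text-02", 1, 2, "mljako", "mljako"),
    ("text-02", 1, 3, "hljab", "hljab"),
]


def build_lexeme_index(tokens):
    """Map lexeme -> dialect form -> list of (text_id, line_no, token_no)."""
    index = defaultdict(lambda: defaultdict(list))
    for text_id, line_no, token_no, form, lexeme in tokens:
        index[lexeme][form].append((text_id, line_no, token_no))
    return index


def attested_forms(index, lexeme):
    """Return every attested form of a lexeme with its locations."""
    # .get() avoids creating an empty entry for an unknown lexeme.
    return {form: list(locs) for form, locs in index.get(lexeme, {}).items()}


index = build_lexeme_index(TOKENS)
for form, locations in attested_forms(index, "mljako").items():
    print(form, locations)
```

Each location triple identifies a text, a line, and a token position, which is the information a user would need to jump to the matching passage in the text or the audio file.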
Goals of the project
- illustrate the diversity of Bulgarian dialects in a more vivid and realistic manner than is currently possible with dialect atlases
- illustrate two important “living traditions” in Bulgaria: that of village life as it maintains its inheritance from the past, and that of Bulgarian dialectology as it documents village speech in its living context
- allow linguistic analysis at a level broader than the lexical or phonetic, by using longer speech samples as the base data, and by cataloguing elements of discourse
- allow comparative access to ethnographic material within the text samples
- ensure a solid representative network of sites, covering all basic Bulgarian dialect types
- allow users direct access to the data, both by making the primary audio files available, and by making the identification of traits and attributes as transparent as possible