The Long Term Documentation Initiative
The documentation initiative pursues comprehensive
multimedia documentation of the endangered language and cultural heritage of
Eastern Khanty, a native Siberian Uralic language.
The language of the Khanty (a.k.a. Ostyak)
forms, together with Mansi, the Ob'-Ugric subgroup of the Finno-Ugric group
of the Uralic language family.
Though often considered a single
language, Khanty is a group of dialect clusters
or, to many, including the Khanty themselves, languages
(western and eastern). The dialects of interest in this study, a
continuum of related river dialects of
and Vakh, and
Yugan, are particularly
interesting as they represent a reportedly more archaic and rich system in
morphosyntactic terms. The total number of speakers of these very
diverse, linguistically prominent and extremely under-documented dialects is
rapidly decreasing to under 500, placing these dialects in the
group of languages in imminent danger of extinction within a single
generation, that is, moribund (Krauss, 1992). These
dialects demonstrate numerous features that attract considerable typological
and sociolinguistic interest.
The project team
enjoys close contacts with representatives of the respective indigenous
communities, established over a period of consistent fieldwork. In the
course of my
PhD thesis project, I
implemented, tested and refined the research methodology, documentation and
archiving techniques; performed initial familiarization of community
representatives with the nature of
Together with my colleagues, I
performed the necessary networking activities, securing the cooperation of local
government authorities, community leaders and representatives. The proposed
documentation project serves
as a logical extension in the scope and depth of the
language database, encompassing a
larger number of endangered Eastern Khanty dialects, a wider variety of genres,
and a greater number and wider range of social strata of speakers.
The activities of the project focus on
obtaining language data and metalinguistic information, archiving the data
and representing it in accessible formats. We conduct extensive fieldwork in
the native communities and document the endangered dialects of the
Khanty in their natural functional environment, with an emphasis on
natural discourse. Previously collected data on the proposed
dialects are also re-archived in standardized, generally accepted formats,
supplementing the computerized database (text corpora, lexica, metadata,
ethnographic information, digital audio/video and photo documents). We also
perform preliminary processing of the data. The narrative discourse data are
subsequently interlinearized and parsed, forming the
Eastern Khanty corpus
which we envision as a useful tool for educational and academic
purposes. The corpus is expected to subsequently form part of an aligned
parallel corpus of Eastern Khanty, Russian and English. One of the
outcomes of the project is also a trilingual Khanty-Russian-English
dictionary in the ToolBox
format, potentially convertible into any combination of the three languages.
In the process of
archiving and processing, we
use the equipment, software and methodologies that, in my experience, represent
the best current practice in language documentation and archiving,
mostly originating from the DOBES
data will be made available on CD/DVD media and on the web. Members of
will receive printed and bound copies of the selected texts and the dictionary,
as well as audio and video materials on generally accepted media.
The project has considerable intellectual
merit, as there is a glaring need for further documentation to provide the
basis for research and for the development of teaching resources for the
Khanty dialects. For adequate typological observations, it is important to
secure a maximally authentic, diverse, and quantitatively significant
corpus of naturally occurring language data, along with the accurate
Such key research fields as
Areal Linguistics, Sociolinguistic Analysis, Comparative and Typological
Analysis, Aspects of Language Contact and Diffusion of Linguistic/Cultural
in such an undoubtedly exciting
area as Siberia will benefit
significantly from the project's outcome,
compared to the current
uneven empirical landscape resulting from inadequate documentation of the
areally adjacent languages/cultures. Thus, the project will fill the
empirical gap that still exists with regard to some Siberian languages,
particularly the Eastern Khanty dialects.
The project’s broader impact will come from
continued collaboration with representatives of the communities and their
training in language documentation and ethnographic research. To ensure
community ownership of and access to the outcomes of the project, the
database of the archived audio/video documents will be accessible to the
native community representatives as well as to NGOs, official agencies,
media and academics active in the area.
Krauss, Michael. 1992. The world's
languages in crisis. Language 68(1).1-42.
History and News:
The principal work
within the project started in 1999 with my
IPF fellowship, when I started developing the
objectives and methodologies of Eastern Khanty documentation, archiving and
analysis. The research and documentation activities then proceeded within
a variety of projects with different collaborators and various support
sources. From 2000 through 2003 my fieldwork was supported by the
Department of Linguistics summer research grants. In 2003-2004, sessions of
fieldwork were supported by the
FEL field research grants.
In 2004 my Eastern Khanty field research and linguistic analysis were
supported by the
dissertation support grant, and finally, in 2005-06 this work developed
into this specific documentation project supported by the joint
NEH/NSF DEL program.
- The OSI IPF term completed February, 2001.
- Rice University summer field research term completed
- The Yale ELF project term completed August 1, 2004.
- The British FEL project term completed December 1,
- The NSF/NEH project term completed October 31, 2006.