Multimedia documentation

of the endangered languages of Vasyugan and Alexandrovo Khanty

of Tomsk region in Siberia

(ELDP FTG 0135)







The project is sponsored by an ELDP field trip grant (FTG) and pursues field documentation of language data and meta-information on two related and geographically adjacent endangered dialects of the Eastern Khanty language in the Tomsk region of Russia: Vasyugan and Alexandrovo (fewer than 100 speakers in total). Multimedia data are recorded and archived in unified, conventional and accessible formats based on DoBeS recommendations, resulting in a lexical database, a corpus and a preliminary data description. A team of linguists with experience in the area and a working knowledge of the language conducts field visits to the Vasyugan and Alexandrovo Khanty communities, documenting the dialects in their natural functional environment.



The Department of Siberian Indigenous Languages

Tomsk State Pedagogical University

Komsomolsky pr.75, k.246

Tomsk 634041 Russia 

The Database of Endangered Languages of Ob-Yenisei

This project is a component of the Database of Endangered Languages of the Ob-Yenisei river basins in Siberia, a multilingual data bank under development at the Department of Siberian Indigenous Languages, Tomsk State Pedagogical University, since the fall of 2005. At present, the Database contains computer corpora, lexica and metalinguistic information on 10 languages, including the native Siberian dialects of Khanty, Ket, Selkup, Nganasan, Nenets, Enets, Chulym Turkic, Siberian Tatar, Evenk and Dolgan. Since 2000, the data collection, archiving and descriptions that form the database have been prepared with the financial support of ELF and FEL field research grants, an NSF support grant, a joint NEH/NSF DEL program grant, the Russian Foundation for Basic Research (RFBR) and the Russian Foundation for Humanities (RGNF).


The Endangered Language of Khanty

The language of the Khanty (a.k.a. Ostjak, Hanti, Khanti) forms, together with Mansi, the Ob'-Ugric subgroup of the Finno-Ugric group of the Uralic language family. Khanty is a group of dialect clusters, or languages (western and eastern). The dialects of interest in this study form a continuum of related river varieties of Eastern Khanty: Vasyugan Khanty and Alexandrovo Khanty, which are particularly interesting because they reportedly represent more archaic and richer language systems in morphosyntactic terms. The total number of speakers of these diverse, linguistically prominent and extremely under-documented varieties is rapidly decreasing, which places these dialects among the languages in imminent danger of extinction within a single generation. Eastern Khanty dialects demonstrate numerous features that attract considerable typological and sociolinguistic interest.

The project's researchers enjoy close cooperation with the respective indigenous communities, established over a period of persistent fieldwork. The research team tests, refines and implements modern research methodology and documentation and archiving techniques, and introduces community representatives to the nature of language documentation. The documentation project serves as a logical extension, in scope and depth, of the existing pilot language database, encompassing a larger number of endangered western Siberian dialects, a wider variety of genres, and more speakers from more social strata.

The activities of the project focus on obtaining language data and metalinguistic information in the natural functional environment, archiving the data and representing it in accessible formats. The acquired language data undergo preliminary processing; narrative discourse is subsequently interlinearized and parsed, forming the Eastern Khanty corpus, to be used as a tool for educational and academic purposes. Another outcome of the project is a trilingual Khanty-Russian-English dictionary in the Toolbox format. The data can be made available on CD/DVD media and on the web. Members of the Eastern Khanty communities receive printed copies of the selected texts and the dictionary, as well as audio and video materials on commonly used media. There is a pressing need for documentation to provide the basis for the development of teaching resources in the Khanty dialects.

The project is also important for adequate typological observations, as it secures a maximally authentic, diverse and quantitatively significant corpus of naturally occurring language data, along with accurate metalinguistic information. Key research fields such as areal linguistics, sociolinguistic analysis, comparative and typological analysis, and the study of language contact and the diffusion of linguistic and cultural features in the Ob-Yenisei area of Siberia will benefit significantly from the project's outcome. The project thus fills an empirical gap that exists with regard to some Siberian languages, particularly Eastern Khanty.

The project’s broader impact comes from continued collaboration with representatives of the communities in language documentation and ethnographic research. To support community ownership of and access to the project’s outcomes, the database of archived audio and video documents will be accessible to native community representatives as well as to NGOs, official agencies, media and academics active in the area.

Project Collaborators

In the course of this project, the team consisted of four collaborators from the Department of Siberian Indigenous Languages at Tomsk State Pedagogical University, who made valuable contributions to the project's design and output.

Andrey Filchenko

Project director


Head of the Department of Siberian Indigenous Languages
Tomsk State Pedagogical University

- Khanty Language, syntax and semantics, information structure, corpus linguistics, language archiving, language variation and change, language contact, language policies.

filtchenko [at]


Potanina Olga

research fellow


Department of Siberian Indigenous Languages
Tomsk State Pedagogical University


- Khanty Language, lexical semantics, grammaticalization, language contact.


olgaponatina [at]


Lemskaya Valeria


graduate student


Department of Siberian Indigenous Languages
Tomsk State Pedagogical University


- Chulym Turkic Language & Khanty Language, morphosyntax, corpus linguistics, language contact.


lemskaya [at]

Mindijarova Elmira


graduate student


Department of Siberian Indigenous Languages
Tomsk State Pedagogical University


- Tatar Language & Khanty Language, morphosyntax, language contact.


elmirka [at]

© Andrey Filchenko. Tomsk State Pedagogical University, Department of Siberian Indigenous Languages. Komsomolsky pr. 75, k.246, Tomsk 634041 Russia.

E-mail: filtchenko [at]
Last updated: April 2008.