Speech Technologies

Academic year: 2024/2025
Language of instruction: English
Credits: 6
Status: Elective course
Delivered: 2nd year, modules 1 and 2

Instructor

Гурков Иван Евгеньевич

Course Syllabus

Abstract

The course introduces students to the basic principles and methods of speech signal analysis, automatic speech synthesis, and automatic speech recognition. Students gain an understanding of speech signal acoustics and learn to apply tools for processing and annotating the signal. They are also introduced to existing speech recognition and synthesis systems and learn to apply them in practice.
Learning Objectives

  • Become familiar with signal processing methods
  • Become familiar with methods of speech recognition and synthesis
  • Become acquainted with systems and models for speech synthesis and recognition
Expected Learning Outcomes

  • understands the acoustic theory of speech production and works with basic acoustic concepts (frequency, period, amplitude, resonator, spectrum, harmonics, formants, fundamental frequency)
  • has signal processing skills: computing instantaneous spectra and spectrograms, estimating formants, annotating signals in Praat, and manipulating signal properties (amplitude, fundamental frequency)
  • is familiar with the main methods of speech synthesis (concatenative (compilative): subphonetic, allophonic, diphone, syllabic, macrosynthesis, unit selection; parametric; articulatory)
  • can develop a sound database for concatenative (compilative) synthesis
  • understands the components of an automatic speech recognition (ASR) system: acoustic model, language model, decoder
  • can extract acoustic features relevant for ASR from a signal using Kaldi or Python
  • understands the principles of building pronunciation dictionaries and is familiar with methods and tools for developing them
  • can apply ASR systems in practice and evaluate recognition quality
Course Contents

  • Acoustic theory of speech production
  • Acoustic analysis of the speech signal
  • History of speech technologies
  • Approaches to speech synthesis
  • Concatenative (compilative) speech synthesis
  • Automatic transcription and text normalization
  • General information about ASR systems
  • Acoustic modeling in ASR systems
  • Language modeling and dictionaries in ASR systems
  • Finding the right solution
Assessment Elements

  • non-blocking Homework
    Homework consists of practical assignments
  • non-blocking Exam
    The exam is oral and based on exam tickets. Each ticket contains two questions
Interim Assessment

  • 2024/2025 2nd module
    0.3 * Exam + 0.7 * Homework
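
As a worked illustration of the formula above, the weighted grade can be computed as in the minimal sketch below. The function name `interim_grade` and the sample scores are hypothetical, and no rounding is applied because the syllabus does not state a rounding rule.

```python
def interim_grade(exam: float, homework: float) -> float:
    """Weighted interim grade from the syllabus: 0.3 * Exam + 0.7 * Homework.

    The function name is illustrative only; rounding is intentionally
    left out because the syllabus does not specify a rounding rule.
    """
    return 0.3 * exam + 0.7 * homework


if __name__ == "__main__":
    # Sample scores are invented for illustration only.
    print(interim_grade(exam=8, homework=6))
```

Because homework carries 70% of the weight, a one-point change in the homework score moves the interim grade more than twice as much as a one-point change in the exam score.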
Bibliography

Recommended Core Bibliography

  • Speech and language processing, Jurafsky, D., 2014

Recommended Additional Bibliography

  • A history of communications : media and society from the evolution of speech to the Internet, Poe, M. T., 2011

Authors

  • Kolmogorova Anastasiia Vladimirovna
  • Корнева Анна Михайловна
  • Kessel Kseniia Vitalevna