Multi-task Speech Semantic Recognition with Few-shot Behavior
Because semantic information such as arousal, valence, or abstract emotion labels (e.g. happy, sad, angry…) is often gathered manually, speech emotion recognition (SER) research usually suffers from a scarcity of emotion-labeled speech.
Our few-shot technologies, based on transfer learning and multi-task learning, can be applied to these data-scarce conditions, so that the ANN model reaches high performance even when labeled data is in short supply.
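As a rough illustration of the multi-task idea only, the sketch below combines arousal, valence, and emotion-class objectives into one loss over a shared feature vector. It is a generic textbook formulation, not humelo's method, whose details are not published here. All names (`predict`, `multitask_loss`, the `heads` layout, the task weights) are hypothetical, and the shared features are assumed to come from some pretrained encoder, which is where transfer learning would enter.

```python
import math


def linear(features, weights, bias):
    """Dot product of a feature vector with one weight vector, plus a bias."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def predict(features, heads):
    """Run three task heads on the same shared features (hypothetical layout)."""
    return {
        "arousal": linear(features, *heads["arousal"]),
        "valence": linear(features, *heads["valence"]),
        "emotion_logits": [linear(features, w, b) for w, b in heads["emotion"]],
    }


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multitask_loss(pred, target, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of per-task losses; tasks with no label are skipped.

    Skipping unlabeled tasks is an assumption made here so that a sample
    carrying only some of the labels can still contribute to training.
    """
    w_arousal, w_valence, w_emotion = weights
    loss = 0.0
    if target.get("arousal") is not None:
        loss += w_arousal * (pred["arousal"] - target["arousal"]) ** 2
    if target.get("valence") is not None:
        loss += w_valence * (pred["valence"] - target["valence"]) ** 2
    if target.get("emotion") is not None:
        probs = softmax(pred["emotion_logits"])
        loss += w_emotion * -math.log(probs[target["emotion"]])
    return loss
```

In a transfer-learning setup, the encoder producing `features` would typically be pretrained on a larger corpus and only the small heads fitted on the scarce emotion labels; that split is likewise an assumption of this sketch.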
(Blue: our DNN model without humelo's few-shot technology)
(Red: our DNN model with humelo's few-shot technology applied)
* Details of the models and results will be updated after the patent applications.