# Team C0FF33 - Google Mentorship Program

Shayan Chowdhury, Matthew Ming, Larry Wong, Henry He

Mentor: Alex Liu

Blurb: Machine learning project for detecting emotion from vocal input.

## Description

Our current idea is a research-style project centered on detecting emotion from the way people speak. In everyday life, people deliver the same words in a multitude of tones. Although many kinds of software and algorithms already let machines understand our words, they handle these variations poorly: the changes in delivery when we are angry or excited are difficult to capture with standard voice recognition. In other words, the expressive part of speech tends to show up as noisy data, so a machine receives less of the relevant information, rather than the subtle cues that humans are able to recognize. While accents and dialects are specific to particular groups of people, we believe the tones people use to express certain emotions are largely universal, so they should not hinder higher-level voice recognition. This is the problem we aim to address in this mentorship.

As part of our procedure, we plan to train a machine learning model to carry out this task. So far, we have found some publicly available datasets for training, and we are also considering adding vocal clips from movie lines. Various factors can act as features for the model, including tone, amplitude, pitch, and tempo, but we are not yet sure how to implement this or how heavily each feature should be weighted relative to the others. A sketch of how such features might feed a classifier appears below.

Another consideration is the programming language to use. Python would be significantly easier to write, as well as more readable and comprehensible, but it would be significantly slower than a lower-level language like C, or even Java, especially if we want to use audio data in real time to detect emotions.
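As a rough illustration of the modeling side, the following is a minimal sketch of a small Keras classifier trained on hand-picked acoustic features. The feature count, emotion label set, array shapes, and hyperparameters are all placeholders, and the random arrays stand in for a real labeled dataset; actual feature extraction from audio is not shown here.

```python
# Minimal sketch: a small Keras classifier over pre-extracted acoustic features.
# Shapes, labels, and hyperparameters are placeholders, not final choices.
import numpy as np
from tensorflow import keras

NUM_FEATURES = 3                                  # e.g. [mean_pitch, rms_amplitude, tempo]
EMOTIONS = ["angry", "happy", "sad", "neutral"]   # placeholder label set

# Placeholder data standing in for a real dataset of labeled clips.
X = np.random.rand(200, NUM_FEATURES).astype("float32")
y = np.random.randint(len(EMOTIONS), size=200)

model = keras.Sequential([
    keras.layers.Input(shape=(NUM_FEATURES,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(len(EMOTIONS), activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=16, validation_split=0.2)
```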

## Future Applications/Implications

- Allow machines to better understand a person's mood
  - A personal comfort tool to help a person cope with emotional situations
  - An entertainment tool better suited and specialized for varied audiences

Machines currently take words literally, and letting them perceive emotion adds a degree of nuance to their responses. With the increasing prevalence of personal assistants in homes, such a system could also be used, with the user's consent, to identify their state of mind and mental health based on the tonality of their voice, among other factors.

## Required Libraries

### Text Analysis

- Keras and TensorFlow
- NumPy
- NLTK
- scikit-learn
- SpeechRecognition
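The following is a minimal sketch of how the text-analysis side might tie two of these libraries together: transcribing a clip with SpeechRecognition and scoring the transcript's sentiment with NLTK's VADER analyzer. The file path `clip.wav` is a placeholder, and `recognize_google()` requires internet access; this is not our final pipeline.

```python
# Minimal sketch: speech-to-text with SpeechRecognition, then sentiment
# scoring of the transcript with NLTK's VADER. "clip.wav" is a placeholder.
import speech_recognition as sr
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download of the VADER lexicon

recognizer = sr.Recognizer()
with sr.AudioFile("clip.wav") as source:
    audio = recognizer.record(source)

text = recognizer.recognize_google(audio)              # transcribe the clip
scores = SentimentIntensityAnalyzer().polarity_scores(text)
print(text, scores)
```

Note that this only captures sentiment from the words themselves; the acoustic features discussed above (tone, amplitude, pitch, tempo) would be handled separately by the model.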

## Resources

### Articles

### Data Files

#### Free

#### Requires Permission/Account