Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update sphinx lunch #173

Merged
merged 2 commits into from
Oct 12, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 4 additions & 2 deletions _pages/sphinx-lunch.md
Original file line number Diff line number Diff line change
Expand Up @@ -25,11 +25,13 @@ A tentative schedule can be found [here](https://docs.google.com/spreadsheets/d/

## Future Talks (tentative schedule)

## Previous Talks

- October 12, 2023
- Title: Computational Audition through Imprecise labels
- Speaker: Ankit Shah

## Previous Talks
- Abstract: In this talk, we delve into computational auditory processing to mimic how humans and animals interpret sounds to interact with their surroundings effectively. The journey begins with the machine's challenge to recognize a vast array of sounds limited by the known sounds in our datasets. This limitation becomes glaring as current models require large labeled datasets for accuracy, which often isn't feasible in real-world settings due to data scarcity. We then spotlight core issues: the strength of sound labels within available datasets. The quandary is that even with a fraction of known sounds and limited data, inaccuracies in sound labeling lead to suboptimal models. Our focus shifts to devising strategies for sound modeling amidst inaccurate, weak or incomplete labels, termed as working with imprecise labeled data. Our exploration includes enhancing the existing annotations, understanding the effects of label noise and corruption, and innovating a co-training approach for learning sound events from web data without human intervention. We venture into exploiting additional cues like event counts and durations with negligible extra effort, introducing the concept of semi-weak labels. Lastly, the talk describes a unified framework encapsulating all our approaches, making a robust model capable of handling various labeling scenarios, paving a solid foundation for future endeavors in understanding and modeling the world of images (transferrable to sounds), irrespective of label availability. Through this, we aspire to bridge the gap between the human brain's natural sound-processing ability and machines, opening doors to a more harmonious interaction with the acoustic world around us.
- Bio: Ankit Shah is a Ph.D. student in the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University. Ankit earned his master's in Language technologies at Carnegie Mellon University in 2019 and his bachelor's in electronics and communication engineering from the National Institute of Technology Karnataka Surathkal. He has worked in the industry for over 4 years as a verification engineer and project lead at ARM and as a Deep learning research Scientist at ReviveMed before joining the Ph.D. program. His areas of interest are audio understanding, machine learning, and deep learning. His thesis focuses on learning in the presence of weak, uncertain, and incomplete labels, where he has made several key contributions, including the setting up DCASE challenges on the topic. He has won the Gandhian Young Technological Innovator (GYTI) award in India for his contribution to building a never-ending learner of sound systems. His team recently emerged as a winning team in the NYC AI Hackathon challenge on LLM (Large Language Model and generative AI. He enjoys reading several books during the year, listens to music, and loves to travel. Further, he is keenly interested in Economics, Startups, Entrepreneurship, etc. Website: https://ankitshah009.github.io

- October 5, 2023
- Title: Adaptive Non-Causality for Speech Recognition
Expand Down
Loading