Muma
PUBLIC
Canada, University of Toronto
Project Overview
The problem that Muma addresses is a technical one. The current music industry standard for musicians, performing rights organizations (SOCAN, ASCAP, BMI, etc.), and audio matching services (Shazam, SoundHound, etc.) to detect music usage online is fingerprinting technology. Fingerprinting excels at matching original recordings with recordings that are similar, but falters when the recordings are heavily altered. Even if someone covers Hotline Bling by Drake in a country style, Drake, his producer, the lyricists, the composers, and anyone else who owns a right to the song should still be paid royalties for whatever part of the composition was used.
Muma offers a way to reach the currently uncaptured revenue from user generated content.
Muma is able to handle changes in genre, pitch, tempo, and audio quality and extract the invariant feature: lyrics. The majority of user generated content (UGC) for music does not alter lyrics drastically. Our solution combines multi-layered convolutional neural networks with analysis of peripheral information (such as comments on a YouTube video) in order to produce accurate lyrics. Our solution has the following flow:
UGC Song → CNN to get Genre ID with Azure Computer Vision API → use Google Speech Recognition to get unfiltered lyrics → use Azure Text Analytics API to get filtered lyrics → use filtered lyrics and Bing Web Search API to match UGC to original content
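The sketch below traces the last three steps of that flow under a few stated assumptions: the genre-ID CNN (step 1) is omitted and its output is passed in as an optional genre string, the raw lyrics come from the open-source SpeechRecognition wrapper around Google's recognizer, and the Text Analytics key-phrase and Bing Web Search calls use illustrative endpoints, regions, and placeholder keys rather than our production configuration.

```python
# Minimal sketch of the Muma matching flow (assumptions noted inline; not production code).
import requests
import speech_recognition as sr  # pip install SpeechRecognition

AZURE_TEXT_KEY = "<azure-text-analytics-key>"   # placeholder credentials
BING_SEARCH_KEY = "<bing-web-search-key>"

def transcribe_lyrics(wav_path):
    """Step 2: get unfiltered lyrics via Google Speech Recognition."""
    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)  # best-guess transcript of the vocal track

def filter_lyrics(raw_text):
    """Step 3: keep only the key phrases, using the Text Analytics key-phrase
    endpoint (region and API version are assumptions)."""
    resp = requests.post(
        "https://westus.api.cognitive.microsoft.com/text/analytics/v2.0/keyPhrases",
        headers={"Ocp-Apim-Subscription-Key": AZURE_TEXT_KEY},
        json={"documents": [{"id": "1", "language": "en", "text": raw_text}]},
    )
    resp.raise_for_status()
    return " ".join(resp.json()["documents"][0]["keyPhrases"])

def match_original(filtered_lyrics, genre=None):
    """Step 4: search the filtered lyrics (plus the genre label, if known) on
    Bing Web Search and treat the top hit as the candidate original work."""
    query = f'"{filtered_lyrics}" lyrics' + (f" {genre}" if genre else "")
    resp = requests.get(
        "https://api.cognitive.microsoft.com/bing/v7.0/search",
        headers={"Ocp-Apim-Subscription-Key": BING_SEARCH_KEY},
        params={"q": query, "count": 5},
    )
    resp.raise_for_status()
    hits = resp.json().get("webPages", {}).get("value", [])
    return hits[0]["name"] if hits else None

if __name__ == "__main__":
    raw = transcribe_lyrics("ugc_cover.wav")       # hypothetical UGC audio file
    filtered = filter_lyrics(raw)
    print("Candidate original:", match_original(filtered))
```

Because the lyrics are the invariant feature, the search query is built from the filtered key phrases rather than from an acoustic fingerprint, which is what lets the match survive changes in genre, pitch, and tempo.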
Using Muma, PROs will be able to reach the remaining 20% of usages that fingerprinting misses, and thus claim more content and ultimately collect more royalties for their members.
Our team started out working on a program to help increase the royalties collected for artists, but we believe that, outside of the music industry, our technology can also be used by video service providers to translate and caption complex audio, improving the experience for the hearing impaired.
Currently, services such as YouTube cannot automatically caption complex audio. The majority of music videos either do not have captions or have been manually captioned. But music videos are not the only videos with complex audio: there are yoga instruction videos with background music, vlogs, tutorials, gaming videos, and a number of other genres of videos where accurate captions are not available. Using Muma, we can make accurate captions available for a wide range of content that previously had to be captioned manually or went uncaptioned altogether, making it accessible to the hearing impaired.
About Team
Our team consists of five members: Irene Lin, Shunzhe Yu, YiMing Han, Gabriel Bussieres, and Adamo Carolli. All members have a background in computer science; however, what makes our group cohesive and productive is our individual specializations.
Irene Lin: UX/UI, Design
4th year, Honours B.S Cognitive Science
Advanced Year GPA 3.80/4.00
Founder and President, UDesign Studio, 2015 – Present
Oversee client acquisition and individual project managers for over 11 unique projects annually for local businesses, startups and organizations. Clients include LevelUp Reality Games, Sunnybrook Hospital Research Centre, Hart House Farms, and others. Website: udesignstudio.ca
Web Developer, University of Toronto Faculty of Arts & Science, May 2017 – Present
Designed and developed child themes using Genesis framework for Teaching and Learning Support website and First Year Learning Communities website.
Project Developer, Kinetica Dynamics, December 2017 – April 2017
Performed business analysis and created designs for new website.
Shunzhe Yu: Machine Learning
4th year, Honours B.S Computer Science
Advanced Year GPA 3.95/4.00
Undergraduate Researcher with Prof. Ashton Anderson, Aug 2017 – Present
Apply Natural Language Processing to Reddit comments to analyze the dynamic loyalty of users across different communities.
Data Scientist, Sysomos, May 2016 – May 2017
Performed data analysis on social media data; implemented facial matching, video summarization, and image captioning.
Founder, Whalesper, January 2017 – Present
News aggregation app that connects users to local services and deals. 20k users.
Team Leader & Main Code Author, Human Expression Classification 2015
Code Contributor, Autonomous Driving Computer Vision Project, 2016
YiMing Han: DevOps
4th year, Honours B.S Computer Science
Advanced Year GPA 3.80/4.00
DevOps Developer Co-op, Finastra (formerly D+H), May 2016 – Aug 2017
Independently implemented cloud environment setup and deployment automation (including disaster recovery) for the Barometer application, working closely with the Dev team. (http://www.dh.com/product/barometer)
Team Leader, Penguin Rush Development, Sept 2017 – 2018
Led a team of 10 students using Unity3D to develop a game to be presented at the Level Up student video game showcase in April 2018. Game Trailer: https://www.youtube.com/watch?v=o4CNXGsTZmU&t
Gabriel Bussieres: Fullstack dev, testing
4th year, Honours B.S Computer Science
Advanced Year GPA 3.65/4.00
Jr. Software Engineer in Test, Flipp Corporation, May 2016 – September 2017
Led testing for the critical back-end user accounts system, including smoke testing and performance testing; built the integration tests from the ground up.
Royal Ontario Museum Developer, Summer 2015
Developed a photo booth with a partner using a Kinect V1 sensor and Unity3D. Implemented functionality includes changing outfits, changing the background, saving photos, and emailing those photos to museum visitors.
Adamo Carolli: Fullstack dev, Project Management
4th year, Honours B.S Computer Science
Advanced Year GPA 3.80/4.00
Project Coordinator Intern at Hydro One Telecom, 2016-2017
Coordinated with clients, the project team, technicians, and external partners to ensure prompt project deliverables that met or exceeded customer timelines.
Full Stack Developer Intern at SparkGig, Summer 2014
Implemented front-end and back-end sections of the redesign relating to performer search, as well as the database schemas required to facilitate sign-up and customer/performer communication.
Royal Ontario Museum Developer, Summer 2015
Developed machine learning algorithms to allow statues and other museum artifacts to respond intelligently to museum-goers over social media.