Mark Cartwright
Assistant Professor, Informatics
3902E Guttenberg Information Technologies Center (GITC)
About Me
Mark Cartwright comes to NJIT from New York University, where he was a research assistant professor in the Department of Computer Science and Engineering, with affiliations to the Music and Audio Research Lab and the Center for Urban Science and Progress. His research lies at the intersection of human-computer interaction and machine learning applied to music and environmental audio, in a field called machine listening. This field, the auditory sibling of computer vision, aims to endow machines with the ability to perceive and understand sound as humans do. Its applications include assistive hearing devices for the hearing impaired, environmental acoustic monitoring, bioacoustics monitoring, music search and recommendation, intelligent media production tools, autonomous vehicle sensing, and more.

Within this field, his work focuses on interactive solutions, in which the user plays an active role in teaching the machine to understand particular audio concepts. During his Ph.D., he leveraged machine listening to develop easier-to-use interfaces for sound design and music production tools, e.g., a synthesizer controlled by imitating sounds with one's voice. This work enables novice users to engage in creative expression with complex audio tools that typically require significant knowledge and experience to use effectively.

Recently, he has taken part in Sounds of New York City (SONYC), a large National Science Foundation (NSF)-funded project that combines an acoustic sensor network with citizen science and machine listening to monitor, analyze, and mitigate urban noise pollution. The project aims to build tools that can measure the impact of different types of sounds (e.g., jackhammers) on the city's noise pollution and generate real-time alerts so that city agencies can respond more effectively.
Education
Ph.D.; Northwestern University; Computer Science; 2016
M.A.; Stanford University; Music Science and Technology; 2007
B.M.; Northwestern University; Music Technology; 2004
Website
2024 Fall Courses
IS 488 - INDEPENDENT STUDY IN INFO
IS 701B - MASTER'S THESIS
IS 726 - INDEPENDENT STUDY II
IS 776 - IS RESEARCH STUDY
IS 790A - DOCT DISSERTATION & RES
IS 792 - PRE-DOCTORAL RESEARCH
IS 465 - ADVANCED INFORMATION SYSTEMS
IS 489 - INFO UNDERGRAD THESIS RESEARCH
IS 700B - MASTER'S PROJECT
IS 725 - INDEPENDENT STUDY I
Teaching Interests
machine listening, interactive machine learning, audio processing, multimedia computing
Past Courses
CS 485: SELECTED TOPICS IN CS
CS 485: ST: MACHINE LISTENING
CS 698: ST: MACHINE LISTENING
IS 247: DESIGNING THE USER EXPERIENCE
IS 485: SPECIAL TOPICS IN INFORMATION SYSTEMS
IS 485: SPECIAL TOPICS IN IS - I
IS 485: ST: MACHINE LISTENING
IS 657: SPATIOTEMPORAL URBAN ANALYTICS
IS 698: ST: SPECIAL PROJECTS
Research Interests
machine listening, human computer interaction, machine learning, audio, music, crowdsourcing, creativity support tools, interactive machine learning, music information retrieval, digital signal processing
Conference Proceeding
Multi-label Open-set Audio Classification
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), September 2023
Does a quieter city mean fewer complaints? The Sounds of New York City During COVID-19 Lockdown
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2023
A Study on Robustness to Perturbations for Representations of Environmental Sound
August 2022
Urban Rhapsody: Large-scale exploration of urban soundscapes
June 2022
How people who are deaf, Deaf, and hard of hearing use technology in creative sound activities
ACM, October 2022
Active Few-Shot Learning for Sound Event Detection
September 2022
MONYC: Music of New York City Dataset
Weakly Supervised Source-Specific Sound Level Estimation in Noisy Soundscapes
2021
Who Calls the Shots? Rethinking Few-Shot Learning for Audio
2021
Few-Shot Continual Learning for Audio Classification
IEEE, June 2021
Specialized Embedding Approximation for Edge Intelligence: A Case Study in Urban Sound Classification
IEEE, June 2021
Few-Shot Drum Transcription in Polyphonic Music
2020
SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context
2020
Tricycle: Audio Representation Learning from Sensor Network Data Using Self-Supervision
IEEE, October 2019
Voice Anonymization in Urban Sound Recordings
IEEE, October 2019
SONYC Urban Sound Tagging (SONYC-UST): a multilabel dataset from an urban acoustic sensor network
Zenodo, July 2019
Active Learning for Efficient Audio Annotation and Classification with a Large Amount of Unlabeled Data
IEEE, May 2019
Crowdsourcing Multi-label Audio Annotation Tasks with Citizen Scientists
ACM, May 2019
EdgeL3: Compressing L3-Net for Mote Scale Urban Noise Monitoring
IEEE, May 2019
Machine-Crowd-Expert Model for Increasing User Engagement and Annotation Quality
ACM, May 2019
Increasing Drum Transcription Vocabulary Using Data Synthesis
2018
Crowdsourced Pairwise-Comparison for Source Separation Evaluation
IEEE, April 2018
Investigating the Effect of Sound-Event Loudness on Crowdsourced Audio Annotations
IEEE, April 2018
Scaper: A library for soundscape synthesis and augmentation
IEEE, October 2017
The Moving Target in Creative Interactive Machine Learning
2016
An Approach to Audio-Only Editing for Visually Impaired Seniors
ACM, October 2016
Fast and easy crowdsourced perceptual audio evaluation
IEEE, March 2016
Audio Production with Intelligent Machines
2015
MixViz: A Tool to Visualize Masking in Audio Mixes
2015
VocalSketch: Vocally Imitating Audio Concepts
ACM, April 2015
SynthAssist: Querying an Audio Synthesizer by Vocal Imitation
2014
Translating Sound Adjectives by Collectively Teaching Abstract Representations
2014
SynthAssist: an audio synthesizer programmed with vocal imitation
ACM, November 2014
Mixploration: Rethinking the audio mixer interface
ACM, February 2014
Social-EQ: Crowdsourcing an Equalization Descriptor Map
2013
Building a Music Search Database Using Human Computation
2012
Novelty measures as cues for temporal salience in audio similarity
ACM Press, 2012
Interactive Learning for Creativity Support in Music Production
2011
Making Searchable Melodies: Human vs. Machine
2011
Crowdsourcing a Real-World On-Line Query By Humming System
2010
Rage in Conjunction with the Machine
ACM Press, 2007
Conference Abstract
A retrospective on monitoring noise pollution with machine learning in the Sounds of New York City project
Journal of the Acoustical Society of America, May 2023
Journal Article
Mendez Mendez, Ana, Cartwright, Mark, Bello, Juan Pablo, & Nov, Oded (2022). Eliciting Confidence for Improving Crowdsourced Audio Annotations. ACM, 6(CSCW1).
Pardo, Bryan, Cartwright, Mark, Seetharaman, Prem, & Kim, Bongjun (2019). Learning to Build Natural Audio Production Interfaces. Arts, 8(3), 110.
McFee, Brian, Kim, Jong Wook, Cartwright, Mark, Salamon, Justin, Bittner, Rachel M., & Bello, Juan Pablo (2019). Open-Source Practices for Music Signal Processing Research: Recommendations for Transparent, Sustainable, and Reproducible Audio Research. IEEE Signal Processing Magazine, 36(1), 128–137.
Lostanlen, Vincent, Salamon, Justin, Cartwright, Mark, McFee, Brian, Farnsworth, Andrew, Kelling, Steve, & Bello, Juan Pablo (2019). Per-Channel Energy Normalization: Why and How. IEEE Signal Processing Letters, 26(1), 39–43.
Cartwright, Mark, Seals, Ayanna, Salamon, Justin, Williams, Alex, Mikloska, Stephanie, MacConnell, Duncan, Law, Edith, Bello, Juan Pablo, & Nov, Oded (2017). Seeing Sound: Investigating the Effects of Visualizations and Complexity on Crowdsourced Audio Annotations.