The Kernel Mixture Network: A Nonparametric Method for Conditional Density Estimation of Continuous Random Variables
Evolutionary Algorithms in Theory and Practice: Evolution Strategies, Evolutionary Programming, Genetic Algorithms
Does Unlabeled Data Provably Help? Worst-Case Analysis of the Sample Complexity of Semi-Supervised Learning
Some Phonatory and Resonatory Characteristics of the Rock, Pop, Soul, and Swedish Dance Band Styles of Singing
Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
Evaluation of Quality of Sound Source Separation Algorithms: Human Perception vs Quantitative Metrics
Alignment of Lyrics With Accompanied Singing Audio Based on Acoustic-Phonetic Vowel Likelihood Modeling
Audio Word2Vec: Unsupervised Learning of Audio Segment Representations Using Sequence-to-Sequence Autoencoder
Comparing the Areas under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach
New Types of Deep Neural Network Learning for Speech Recognition and Related Applications: An Overview
Online Singing Voice Separation Using a Recurrent One-Dimensional U-NET Trained with Deep Feature Losses
MuseGAN: Multi-Track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Structured Dropout for Weak Label and Multi-Instance Learning and Its Application to Score-Informed Source Separation
Automatic Synchronization between Lyrics and Music CD Recordings Based on Viterbi Alignment of Segregated Vocal Signals
A Modeling of Singing Voice Robust to Accompaniment Sounds and Its Application to Singer Identification and Vocal-Timbre-Similarity-Based Music Information Retrieval
Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position
MetricGAN: Generative Adversarial Networks Based Black-Box Metric Scores Optimization for Speech Enhancement
Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks
Evidence for Distinguishing Pressed, Normal, Resonant, and Breathy Voice Qualities by Laryngeal Resistance and Vocal Efficiency in Vocally Trained Subjects
Recognition of Phonemes in A-Cappella Recordings Using Temporal Patterns and Mel Frequency Cepstral Coefficients
Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation
Recommendation ITU-R BS.1534-2: Method for the Subjective Assessment of Intermediate Quality Level of Audio Systems
Large-Scale Genetic Perturbations Reveal Regulatory Networks and an Abundance of Gene-Specific Repressors
One Deep Music Representation to Rule Them All? A Comparative Analysis of Different Representation Learning Strategies
Two Data Sets for Tempo Estimation and Key Detection in Electronic Dance Music Annotated from User Corrections
Cross-Task Learning for Audio Tagging, Sound Event Detection and Spatial Localization: DCASE 2019 Baseline Systems
Tracing the Dynamic Changes in Perceived Tonal Organization in a Spatial Representation of Musical Keys
Joint Detection and Classification of Singing Voice Melody Using Convolutional Recurrent Neural Networks
Missing Data Estimation in High-Dimensional Datasets: A Swarm Intelligence-Deep Neural Network Approach
Deep Learning for Acoustic Modeling in Parametric Speech Generation: A Systematic Review of Existing Techniques and Future Trends
Cutting Music Source Separation Some Slakh: A Dataset to Study the Impact of Training Data Quality and Quantity
Simultaneous Separation and Transcription of Mixtures with Multiple Polyphonic and Percussive Instruments
Lyrics-to-Audio Alignment and Phrase-Level Segmentation Using Incomplete Internet-Style Chord Annotations
Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music
Perceptual Scaling of Synthesized Musical Timbres: Common Dimensions, Specificities, and Latent Subject Classes
Learning to Separate Vocals from Polyphonic Mixtures via Ensemble Methods and Structured Output Prediction
Examining the Perceptual Effect of Alternative Objective Functions for Deep Learning Based Music Source Separation
An Automatic Singing Skill Evaluation Method for Unknown Melodies Using Pitch Interval Accuracy and Vibrato Features
VocaListener2: A Singing Synthesis System Able to Mimic a User’s Singing in Terms of Voice Timbre Changes as Well as Pitch and Dynamics
Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights
Illuminating the “Black Box”: A Randomization Approach for Understanding Variable Contributions in Artificial Neural Networks
Adaptation of Bayesian Models for Single-Channel Source Separation and Its Application to Voice/Music Separation in Popular Songs
Feature Selection Based on Mutual Information Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy
Combining Modeling of Singing Voice and Background Music for Automatic Separation of Musical Mixtures
The Use of Articulatory Movement Data in Speech Synthesis Applications: An Overview—Application of Articulatory Movements Using Machine Learning Algorithms—
Automatic Classification of Phonation Modes in Singing Voice: Towards Singing Style Characterisation and Application to Ethnomusicological Recordings
PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications
Towards the Next Generation of Web-Based Experiments: A Case Study Assessing Basic Audio Quality Following the ITU-R Recommendation BS.1534 (MUSHRA)
A Matlab Toolbox for Efficient Perfect Reconstruction Time-Frequency Transforms with Log-Frequency Resolution
Intuitive and Efficient Computer-Aided Music Rearrangement with Optimised Processing of Audio Transitions
Training Generative Adversarial Networks from Incomplete Observations Using Factorised Discriminators
Singing Voice Enhancement in Monaural Music Signals Based on Two-Stage Harmonic/Percussive Sound Separation on Multiple Resolution Spectrograms
Improving Music Source Separation Based on Deep Neural Networks through Data Augmentation and Network Blending
What Auto-Encoders Could Learn from Brains: Generation as Feedback in Deep Unsupervised Learning and Inference
Evolutionary Multi-Objective Training Set Selection of Data Instances and Augmentations for Vocal Detection
An Overlap-Add Technique Based on Waveform Similarity (WSOLA) for High Quality Time-Scale Modification of Speech
Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion
Intonation: A Dataset of Quality Vocal Performances Refined by Spectral Clustering on Pitch Congruence
Encoding Time Series as Images for Visual Inspection and Classification Using Tiled Convolutional Neural Networks
The Singing Power Ratio as an Objective Measure of Singing Voice Quality in Untrained Talented and Nontalented Singers