so-vits-svc fork with realtime support, improved interface and more features.
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Phoneme segmentation using pre-trained speech models
A mini, simple, and fast end-to-end automatic speech recognition toolkit.
[ICASSP 2023] Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-trained Representations
This repo contains the source code of the first deep learning-based singing voice beat tracking system. It leverages the WavLM and DistilHuBERT pre-trained speech models to create vocal embeddings and trains linear multi-head self-attention layers on top of them to extract vocal beat activations. Then, it uses an HMM decoder to infer singing beats and t…
An implementation of Speech Emotion Recognition based on the HuBERT model, trained with PyTorch and the HuggingFace framework, and fine-tuned on the RAVDESS dataset.
TypeScript implementation of Blockchain Commons specifications: deterministic CBOR (dCBOR), Gordian Envelope for privacy-preserving data, Uniform Resources (UR), secret sharing (SSKR/Shamir), XID decentralized identity, provenance marks, LifeHash visual hashing, and FROST threshold signatures.
The official implementation of the method discussed in the paper Improving Spoken Language Identification with Map-Mix (work accepted at ICASSP 2023)
Cover Song Powered by SoftVC VITS
Google Colab notebook for testing SoftVC VITS Singing Voice Conversion, an AI capable of changing the singer within music files.
Unsupervised scoring of spoken utterances
The code for the MAPSS measures for source separation evaluation.
Layer-aware TDNN: Speaker Recognition Using Multi-Layer Features from Pre-Trained Models, to appear in ICAIIC 2026
Rethinking Leveraging Pre-Trained Multi-Layer Representations for Speaker Verification, ISCA Interspeech 2025
Code for our paper DistilALHuBERT: A Distilled Parameter Sharing Audio Representation Model
Advanced Speech Emotion Recognition, based on ExHuBERT: Enhancing HuBERT Through Block Extension and Fine-Tuning on 37 Emotion Datasets and 14 languages (Emotions: Disgust, Neutral, Kind, Anger, Surprise, Joy)
Speech keyword detection using the Wav2Vec model