🌐
Hugging Face
huggingface.co › facebook › wav2vec2-base
facebook/wav2vec2-base · Hugging Face
July 25, 2025 - wav2vec2 · pretraining · speech · arxiv: 2006.11477 · License: apache-2.0 · Facebook's Wav2Vec2 · The base model pretrained on 16kHz sampled speech audio.
🌐
Hugging Face
huggingface.co › facebook › wav2vec2-large
facebook/wav2vec2-large · Hugging Face
July 25, 2025 - wav2vec2 · pretraining · speech · arxiv: 2006.11477 · License: apache-2.0 · Facebook's Wav2Vec2 · The base model pretrained on 16kHz sampled speech audio.
🌐
Meta
ai.meta.com › blog › wav2vec-20-learning-the-structure-of-speech-from-raw-audio
Wav2vec 2.0: Learning the structure of speech from raw audio
Facebook AI is releasing code and models for wav2vec 2.0, a self-supervised algorithm that enables automatic speech recognition models with just 10 minutes of transcribed speech data.
🌐
Hugging Face
huggingface.co › facebook › wav2vec2-base-960h
facebook/wav2vec2-base-960h · Hugging Face
January 16, 2024 -
    from datasets import load_dataset
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor
    import torch
    from jiwer import wer

    librispeech_eval = load_dataset("librispeech_asr", "clean", split="test")

    model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h").to("cuda")
    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")

    def map_to_pred(batch):
        input_values = processor(batch["audio"]["array"], return_tensors="pt", padding="longest").input_values
        with torch.no_grad():
            logits = model(input_values.to("cuda")).logits
        predicted_ids = torch.argmax(logits, dim=-1)
        transcription = processor.batch_decode(predicted_ids)
        batch["transcription"] = transcription
        return batch

    result = librispeech_eval.map(map_to_pred, batched=True, batch_size=1, remove_columns=["audio"])

    print("WER:", wer(result["text"], result["transcription"]))
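The snippet above reports a word error rate (WER) through `jiwer`. As a rough illustration of what that number means, here is a minimal standard-library sketch that scores one reference/hypothesis pair with a word-level edit distance. The function name `word_error_rate` is hypothetical and not part of `jiwer`; the `jiwer` call in the snippet scores a whole list of sentences, whereas this sketch handles a single pair.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration only; not the jiwer implementation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("the cat sat", "the bat sat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.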
🌐
Metatext
metatext.io › models › facebook-wav2vec2-base
facebook/wav2vec2-base Model - NLP Hub
Metatext is a platform that allows you to build, train and deploy NLP models in minutes.
🌐
GitHub
github.com › oliverguhr › wav2vec2-live
GitHub - oliverguhr/wav2vec2-live: A live speech recognition using Facebooks wav2vec 2.0 model.
Starred by 374 users
Forked by 58 users
Languages   Python
🌐
GitHub
github.com › facebookresearch › fairseq › blob › main › examples › wav2vec › README.md
fairseq/examples/wav2vec/README.md at main · facebookresearch/fairseq
August 5, 2022 -
    # !pip install transformers
    # !pip install datasets
    import soundfile as sf
    import torch
    from datasets import load_dataset
    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    # load pretrained model
    processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
    model = ...
Author   facebookresearch
🌐
GitHub
github.com › openvinotoolkit › open_model_zoo › blob › master › models › public › wav2vec2-base › README.md
open_model_zoo/models/public/wav2vec2-base/README.md at master · openvinotoolkit/open_model_zoo
Wav2Vec2.0-base is a model pre-trained to learn speech representations on unlabeled data, as described in the paper "wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations", and fine-tuned for the speech recognition task ...
Author   openvinotoolkit
🌐
AI Models
aimodels.fyi › models › huggingFace › wav2vec2-large-960h-lv60-self-facebook
wav2vec2-large-960h-lv60-self | AI Model Details
Facebook's Wav2Vec2 is a large model pretrained and fine-tuned on 960 hours of Libri-Light and Librispeech on 16kHz sampled speech audio. The model was trained with a Self-Training objective.
🌐
Metatext
metatext.io › models › facebook-wav2vec2-large
facebook/wav2vec2-large Model - NLP Hub
Metatext is a platform that allows you to build, train and deploy NLP models in minutes.
🌐
arXiv
arxiv.org › abs › 2006.11477
[2006.11477] wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations
October 22, 2020 - We show for the first time that ... wav2vec 2.0 masks the speech input in the latent space and solves a contrastive task defined over a quantization of the latent representations which are jointly learned....
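To make the "contrastive task" in the abstract a little more concrete, here is a minimal standard-library sketch of an InfoNCE-style loss for a single masked time step: a context vector should be more similar to the true quantized target than to a set of distractors. The function names, the toy vectors, and the temperature value are assumptions for illustration; this is not the fairseq implementation, and it leaves out the masking, the quantizer, and the diversity term described in the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(context, target, distractors, temperature=0.1):
    """Negative log-probability of picking `target` among target + distractors.

    Hypothetical sketch: `temperature` is an illustrative value.
    """
    candidates = [target] + list(distractors)
    logits = [cosine_similarity(context, q) / temperature for q in candidates]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
    return log_sum - logits[0]


if __name__ == "__main__":
    # Context aligned with the true target: the loss is close to zero.
    print(contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
    # Context aligned with a distractor instead: the loss is large.
    print(contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]]))
```

The loss shrinks as the context vector moves toward the true target and away from the distractors.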
🌐
AI Models
aimodels.fyi › models › huggingFace › wav2vec2-base-facebook
wav2vec2-base | AI Model Details
June 14, 2025
🌐
Medium
varharsha.medium.com › automatic-speech-recognition-using-facebook-word2vec2-0-db0604df0820
Automatic Speech Recognition using Facebook Wav2Vec2.0
June 2, 2021 - Wav2Vec2 can thus enable improved automatic speech recognition for many more languages and domains, achieving state-of-the-art results with much less annotated data. Thank you for reading!
🌐
X
x.com › PatrickPlaten › status › 1389985153586450433
X
🌐
PyTorch
docs.pytorch.org › audio › 0.9.0 › models.html
torchaudio.models — Torchaudio 0.9.0 documentation
>>> import torchaudio
>>> from transformers import Wav2Vec2ForCTC
>>> from torchaudio.models.wav2vec2.utils import import_huggingface_model
>>>
>>> original = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")
>>> model = import_huggingface_model(original)
>>>
>>> waveforms, _ = torchaudio.load("audio.wav")
>>> logits, _ = model(waveforms)
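The snippet above stops at per-frame logits. For a CTC-trained model, one common way to turn them into text is greedy decoding: take the best-scoring token per frame, collapse consecutive repeats, then drop the blank token. The sketch below shows that idea in plain Python. The toy vocabulary, the blank id of 0, and the `|` word delimiter are assumptions chosen for illustration, and the helper names are hypothetical rather than torchaudio or transformers APIs.

```python
from itertools import groupby

# Hypothetical toy vocabulary; a real model ships its own token table.
TOY_VOCAB = {0: "<pad>", 1: "|", 2: "H", 3: "E", 4: "L", 5: "O"}


def argmax_per_frame(frame_scores):
    """Return the index of the highest score in each frame."""
    return [max(range(len(row)), key=row.__getitem__) for row in frame_scores]


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse repeated ids, remove blanks, and join tokens into a string."""
    # groupby merges runs of identical consecutive ids into one entry.
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    return "".join(tokens).replace(word_delimiter, " ").strip()


if __name__ == "__main__":
    # The blank (0) between the two runs of 4 keeps the double "L" intact.
    frames = [2, 2, 0, 3, 3, 4, 0, 4, 4, 5, 0, 1, 1, 2, 0]
    print(ctc_greedy_decode(frames, TOY_VOCAB))  # HELLO H
```

Collapsing repeats before removing blanks is what lets a blank frame separate two genuinely repeated characters.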
🌐
Analytics Vidhya
analyticsvidhya.com › home › introduction to hugging face’s transformers v4.3.0 and its first automatic speech recognition model – wav2vec2
Introduction to Hugging Face's Transformers v4.3.0 and its First Automatic Speech Recognition Model - Wav2Vec2
February 15, 2021 - And Hugging Face has no plans to stop expanding its applications. It has just released v4.3.0 of Transformers, its state-of-the-art natural language processing library, and extended its reach to speech recognition by adding Wav2Vec2, one of the leading automatic speech recognition models, developed by Facebook.
🌐
Lightrun
lightrun.com › answers › speechbrain-speechbrain-cant-load-feature-extractor-for-facebookwav2vec2-large-xlsr-53
Can't load feature extractor for 'facebook/wav2vec2-large-xlsr-53'
# Url for xlsr wav2vec2
#wav2vec2_hub: C:\Users\nathk\Documents\TacoProject\brain\speechbrain\xlsr_53_56k.pt
wav2vec2_hub: facebook/wav2vec2-large-xlsr-53
#wav2vec2_hub: C:/Users/nathk/Documents/TacoProject/brain/speechbrain/recipes/DVoice/ASR/CTC/results/wav2vec2_ctc_FONGBE/1250/save/wav2vec2_checkpoint/config.json

# Data files
data_folder: /localscratch/ALFFA_PUBLIC/ASR/FONGBE/data
train_csv_file: !ref <data_folder>/train.csv