Hypothesis

20 Matching Annotations

Jun 2023
platform.openai.com

OpenAI API

1
1. chrisaldrich 02 Jun 2023
  
  in Public
  
  OpenAI API
  
  OpenAI speech to text APIs OEG Live 2023-06-02 transcriptions translations

Tags

speech to text

translations

OpenAI

transcriptions

OEG Live 2023-06-02

APIs

Annotators

chrisaldrich

URL

platform.openai.com/docs/guides/speech-to-text
Nov 2022
community.interledger.org

Hyperaudio for Conferences — Final Report

1
1. filslo 28 Nov 2022
  
  in Public
  
  🌟 Highlight words as they are spoken (karaoke anybody?).
  🌟 Navigate video by clicking on words.
  🌟 Share snippets of text (with video attached!).
  🌟 Repurpose by remixing using the text as a base and reference.
  
  If I understand correctly, Hyperaudio also lets one create a transcript for somebody else's video or audio when it is embedded.
  
  In that case, if you add the annotation capability of hypothes.is or docdrop to Hyperaudio, the vision outlined in the article on the Global Knowledge Graph is already a reality.
  
  hyperaudio monetization web monetization audio video transcript translation timing captions annotation docdrop roam navigation sharing remixing repurposing conference simultaneous open source open knowledge global graph translate language learning interactive creative commons speech speech to text speech2text ML mobile wordpress plugin lite

Tags

interactive

timing

repurposing

mobile

video

remixing

commons

hyperaudio

wordpress

lite

transcript

ML

navigation

open source

speech to text

simultaneous

translation

graph

learning

conference

creative

annotation

language

web monetization

speech

docdrop

open

monetization

global

translate

plugin

speech2text

audio

sharing

captions

knowledge

roam

Annotators

filslo

URL

community.interledger.org/hyperaudio/hyperaudio-for-conferences-grant-report-2-3c7g
Jul 2020
github.com

westonruter/spoken-word

1
1. TylerRick 29 Jul 2020
  
  in Public
  
  javascript libraries text-to-speech

Tags

javascript libraries

text-to-speech

Annotators

TylerRick

URL

github.com/westonruter/spoken-word
Apr 2020
deepspeech.readthedocs.io

Python contributed examples — DeepSpeech 0.7.0 documentation

1
1. raj_reddy 25 Apr 2020
  
  in Public
  
  Python contributed examples
  
  Mic VAD Streaming: This example demonstrates getting audio from microphone, running Voice-Activity-Detection and then outputting text. Full source code available on https://github.com/mozilla/DeepSpeech-examples.
  
  VAD Transcriber: This example demonstrates VAD-based transcription with both console and graphical interface. Full source code available on https://github.com/mozilla/DeepSpeech-examples.
  
  speech to text machine learning python deepspeech

Tags

python

deepspeech

speech to text

machine learning

Annotators

raj_reddy

URL

deepspeech.readthedocs.io/en/v0.7.0/Python-contrib-Examples.html
deepspeech.readthedocs.io

Python API Usage example — DeepSpeech 0.7.0 documentation

1
1. raj_reddy 25 Apr 2020
  
  in Public
  
  Python API Usage example
  
  Examples are from native_client/python/client.cc.
  
  Creating a model instance and loading model
  
      ds = Model(args.model)
  
  Performing inference
  
      if args.extended:
          print(metadata_to_string(ds.sttWithMetadata(audio, 1).transcripts[0]))
      elif args.json:
          print(metadata_json_output(ds.sttWithMetadata(audio, 3)))
      else:
          print(ds.stt(audio))
  
  speech to text machine learning deepspeech python

Tags

deepspeech

python

speech to text

machine learning

Annotators

raj_reddy

URL

deepspeech.readthedocs.io/en/v0.7.0/Python-Examples.html
github.com

mozilla/DeepSpeech

1
1. raj_reddy 25 Apr 2020
  
  in Public
  
  DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.
  
  NOTE: This documentation applies to the 0.7.0 version of DeepSpeech only. Documentation for all versions is published on deepspeech.readthedocs.io.
  
  To install and use DeepSpeech all you have to do is:
  
      # Create and activate a virtualenv
      virtualenv -p python3 $HOME/tmp/deepspeech-venv/
      source $HOME/tmp/deepspeech-venv/bin/activate
  
      # Install DeepSpeech
      pip3 install deepspeech
  
      # Download pre-trained English model files
      curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
      curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
  
      # Download example audio files
      curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
      tar xvf audio-0.7.0.tar.gz
  
      # Transcribe an audio file
      deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
  
  A pre-trained English model is available for use and can be downloaded using the instructions below. A package with some example audio files is available for download in our release notes.
  
  speech to text machine learning deepspeech mozilla

Tags

deepspeech

mozilla

speech to text

machine learning

Annotators

raj_reddy

URL

github.com/mozilla/DeepSpeech
research.mozilla.org

Machine Learning & Open Source Speech-to-text Engine Development Project

1
1. raj_reddy 25 Apr 2020
  
  in Public
  
  Speech & Machine Learning
  
  speech to text machine learning

Tags

speech to text

machine learning

Annotators

raj_reddy

URL

research.mozilla.org/machine-learning/
pypi.org

SpeechRecognition

1
1. raj_reddy 25 Apr 2020
  
  in Public
  
  Library for performing speech recognition, with support for several engines and APIs, online and offline.
  
  Speech recognition engine/API support:
  
  - CMU Sphinx (works offline)
  - Google Speech Recognition
  - Google Cloud Speech API
  - Wit.ai
  - Microsoft Bing Voice Recognition
  - Houndify API
  - IBM Speech to Text
  - Snowboy Hotword Detection (works offline)
  
  Quickstart: pip install SpeechRecognition. See the “Installing” section for more details.
  
  To quickly try it out, run python -m speech_recognition after installing.
  
  Project links: PyPI, Source code, Issue tracker
  
  Library Reference
  
  The library reference documents every publicly accessible object in the library. This document is also included under reference/library-reference.rst.
  
  See Notes on using PocketSphinx for information about installing languages, compiling PocketSphinx, and building language packs from online resources. This document is also included under reference/pocketsphinx.rst.
  
  speech to text

Tags

speech to text

Annotators

raj_reddy

URL

pypi.org/project/SpeechRecognition/
github.com

alphacep/vosk-api

3
1. raj_reddy 25 Apr 2020
  
  in Public
  
  Running the example code with python
  
  Run like this:
  
      cd vosk-api/python/example
      wget https://github.com/alphacep/kaldi-android-demo/releases/download/2020-01/alphacep-model-android-en-us-0.3.tar.gz
      tar xf alphacep-model-android-en-us-0.3.tar.gz
      mv alphacep-model-android-en-us-0.3 model-en
      python3 ./test_simple.py test.wav
  
  To run with your audio file make sure it has proper format - PCM 16khz 16bit mono, otherwise decoding will not work.
  
  You can find other examples of using a microphone, decoding with a fixed small vocabulary or speaker identification setup in python/example subfolder
  
  opensource speech to text vosk
2. raj_reddy 25 Apr 2020
  
  in Public
  
  Vosk is a speech recognition toolkit. The best things in Vosk are:
  
  - Supports 8 languages - English, German, French, Spanish, Portuguese, Chinese, Russian, Vietnamese. More to come.
  - Works offline, even on lightweight devices - Raspberry Pi, Android, iOS
  - Installs with simple pip3 install vosk
  - Portable per-language models are only 50Mb each, but there are much bigger server models available.
  - Provides streaming API for the best user experience (unlike popular speech-recognition python packages)
  - There are bindings for different programming languages, too - java/csharp/javascript etc.
  - Allows quick reconfiguration of vocabulary for best accuracy.
  - Supports speaker identification beside simple speech recognition.
  
  opensource speech to text vosk
3. raj_reddy 25 Apr 2020
  
  in Public
  
  Kaldi API for offline speech recognition on Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
  
  speech to text opensource vosk
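
The first Vosk annotation above warns that input audio must be PCM 16 kHz, 16-bit, mono. A minimal sketch of checking that before decoding, using only Python's standard `wave` module. The function name `check_wav_format` is hypothetical and is not part of Vosk:

```python
import wave


def check_wav_format(path, rate=16000, sample_width=2, channels=1):
    """Return a list of mismatches against the expected PCM format.

    The defaults assume the 16 kHz, 16-bit (2-byte), mono format named in
    the annotation. An empty list means the file matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        # The wave module reports "NONE" for uncompressed PCM data.
        if wav.getcomptype() != "NONE":
            problems.append(f"compression: {wav.getcomptype()}")
        if wav.getframerate() != rate:
            problems.append(f"sample rate: {wav.getframerate()} Hz")
        if wav.getsampwidth() != sample_width:
            problems.append(f"sample width: {wav.getsampwidth() * 8} bit")
        if wav.getnchannels() != channels:
            problems.append(f"channels: {wav.getnchannels()}")
    return problems
```

Calling `check_wav_format("test.wav")` on a conforming file returns an empty list; any entries in the result name the properties that would need converting first.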

Tags

opensource

vosk

speech to text

Annotators

raj_reddy

URL

github.com/alphacep/vosk-api
www.analyticsvidhya.com

Learn to Build your First Speech-to-Text Model in Python

4
1. raj_reddy 25 Apr 2020
  
  in Public
  
  import all the necessary libraries into our notebook. LibROSA and SciPy are the Python libraries used for processing audio signals.
  
      import os
      import librosa  # for audio processing
      import IPython.display as ipd
      import matplotlib.pyplot as plt
      import numpy as np
      from scipy.io import wavfile  # for audio processing
      import warnings
      warnings.filterwarnings("ignore")
  
  View the code on Gist: https://gist.github.com/aravindpai/eb40aeca0266e95c128e49823dacaab9
  
  Data Exploration and Visualization
  
  Data Exploration and Visualization helps us to understand the data as well as pre-processing steps in a better way.
  
  neural networks speech to text machine learning
2. raj_reddy 25 Apr 2020
  
  in Public
  
  TensorFlow recently released the Speech Commands Datasets. It includes 65,000 one-second long utterances of 30 short words, by thousands of different people. We’ll build a speech recognition system that understands simple spoken commands. You can download the dataset from here.
  
  neural networks speech to text machine learning
3. raj_reddy 25 Apr 2020
  
  in Public
  
  In the 1980s, the Hidden Markov Model (HMM) was applied to the speech recognition system. HMM is a statistical model which is used to model the problems that involve sequential information. It has a pretty good track record in many real-world applications including speech recognition. In 2001, Google introduced the Voice Search application that allowed users to search for queries by speaking to the machine. This was the first voice-enabled application which was very popular among the people. It made the conversation between the people and machines a lot easier. By 2011, Apple launched Siri that offered a real-time, faster, and easier way to interact with the Apple devices by just using your voice. As of now, Amazon’s Alexa and Google’s Home are the most popular voice command based virtual assistants that are being widely used by consumers across the globe.
  
  neural networks speech to text
4. raj_reddy 25 Apr 2020
  
  in Public
  
  Learn how to Build your own Speech-to-Text Model (using Python)
  
  Aravind Pai, July 15, 2019
  
  Overview
  
  - Learn how to build your very own speech-to-text model using Python in this article
  - The ability to weave deep learning skills with NLP is a coveted one in the industry; add this to your skillset today
  - We will use a real-world dataset and build this speech-to-text model so get ready to use your Python skills!
  
  machine learning speech to text neural networks

Tags

neural networks

speech to text

machine learning

Annotators

raj_reddy

URL

analyticsvidhya.com/blog/2019/07/learn-build-first-speech-to-text-model-python/
realpython.com

The Ultimate Guide To Speech Recognition With Python – Real Python

5
1. raj_reddy 25 Apr 2020
  
  in Public
  
  One can imagine that this whole process may be computationally expensive. In many modern speech recognition systems, neural networks are used to simplify the speech signal using techniques for feature transformation and dimensionality reduction before HMM recognition. Voice activity detectors (VADs) are also used to reduce an audio signal to only the portions that are likely to contain speech. This prevents the recognizer from wasting time analyzing unnecessary parts of the signal.
  
  speech to text
2. raj_reddy 25 Apr 2020
  
  in Public
  
  Most modern speech recognition systems rely on what is known as a Hidden Markov Model (HMM). This approach works on the assumption that a speech signal, when viewed on a short enough timescale (say, ten milliseconds), can be reasonably approximated as a stationary process—that is, a process in which statistical properties do not change over time.
  
  speech to text
3. raj_reddy 25 Apr 2020
  
  in Public
  
  The first component of speech recognition is, of course, speech. Speech must be converted from physical sound to an electrical signal with a microphone, and then to digital data with an analog-to-digital converter. Once digitized, several models can be used to transcribe the audio to text.
  
  speech to text
4. raj_reddy 25 Apr 2020
  
  in Public
  
  How speech recognition works; what packages are available on PyPI; and how to install and use the SpeechRecognition package, a full-featured and easy-to-use Python speech recognition library.
  
  python speech to text hmm
5. raj_reddy 25 Apr 2020
  
  in Public
  
  The Ultimate Guide To Speech Recognition With Python
  
  speech to text hmm python
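
The first Real Python annotation above says that voice activity detectors reduce an audio signal to the portions likely to contain speech. A minimal sketch of that idea, assuming a simple per-frame energy threshold; production VADs are more sophisticated, and the function name, frame length, and threshold here are illustrative assumptions rather than anything from the article:

```python
def detect_speech(samples, frame_len=160, threshold=1e4):
    """Return (start, end) sample index ranges whose mean energy is high.

    samples   -- sequence of integer PCM samples
    frame_len -- samples per frame (160 is 10 ms at an assumed 16 kHz rate)
    threshold -- assumed mean-square energy above which a frame counts as speech
    """
    segments = []
    start = None  # start index of the segment currently being built
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > threshold:
            if start is None:
                start = i  # a loud frame opens a new segment
        elif start is not None:
            segments.append((start, i))  # a quiet frame closes it
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments
```

Only the returned ranges would then be passed to the recognizer, which is how the quiet stretches are skipped.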

Tags

python

hmm

speech to text

Annotators

raj_reddy

URL

realpython.com/python-speech-recognition/
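
The Real Python annotations above describe Hidden Markov Models as the basis of many recognizers but do not show how one is decoded. As an illustration only, here is a sketch of the standard Viterbi algorithm, which finds the most likely hidden-state sequence for a series of observations. The two-state "silence"/"speech" model and all of its probabilities are made-up assumptions, not values from the article:

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observations."""
    if not observations:
        return []
    # best[t][s] = (probability of the best path ending in s at step t,
    #               state that path came from at step t - 1)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p) for p in states)
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy model: per-frame energy is observed as "low" or "high".
STATES = ["silence", "speech"]
START_P = {"silence": 0.8, "speech": 0.2}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
EMIT_P = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.3, "high": 0.7},
}

print(viterbi(["low", "high", "high", "low"], STATES, START_P, TRANS_P, EMIT_P))
```

A real recognizer would use far more states, continuous acoustic features, and log probabilities to avoid numeric underflow; the toy model only shows the shape of the computation.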