What it does
Audio Mining analyses audio and video files in German or English (e.g. a TV news broadcast) and returns text suitable for indexing, for example by a search engine. It segments each recording by speech and by speaker, then applies speech recognition to transcribe the speech into text. The SE delivers the segments, speaker identification, characteristic keywords and additional metadata as XML and JSON. Finally, the SE builds an index for multimedia search.
How it works
Audio Mining combines multimedia pattern recognition algorithms for speech detection, speaker diarisation, speaker recognition and speech recognition. By cascading these algorithms, it automatically derives a broad range of metadata for each media file. This lets users search for terms, quotations or specific speakers, browse archives using content-based recommendations, or retrieve media information such as keywords or SRT-compatible subtitles. Audio Mining includes an Apache Solr search engine that stores all metadata and makes it available via the provided SOAP/REST interface.
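To illustrate how segment metadata of this kind maps onto SRT-compatible subtitles, here is a minimal Python sketch. The Segment class and its fields (start, end, speaker, text) are assumptions made for the example and do not reflect the actual XML/JSON schema delivered by the SE; only the SRT cue layout (index, "HH:MM:SS,mmm --> HH:MM:SS,mmm", text, blank line) follows the common SRT convention.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Segment:
    """One recognised speech segment (hypothetical shape, not the SE's real schema)."""
    start: float   # seconds from the start of the media file
    end: float     # seconds from the start of the media file
    speaker: str   # speaker label, e.g. as produced by diarisation
    text: str      # recognised text for this segment


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues, prefixing each cue with its speaker."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```

Calling to_srt on a list of segments returns a string that can be written to a .srt file. Prefixing the speaker label is a design choice for this sketch; it can be dropped if plain subtitles are preferred.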
What you get
Audio Mining offers a RESTful API that converts audiovisual content in German or English into machine-readable text using automatic speech recognition (ASR). Through the API, the technology can be integrated into existing architectures such as content management systems (CMS).
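As a rough idea of what an integration from a CMS might look like, the following Python sketch builds (but does not send) an HTTP request that submits a media file for transcription. Everything specific here is an assumption for illustration: the base URL, the /jobs path, and the mediaUrl and language payload fields are hypothetical and are not taken from the Audio Mining API; the actual endpoints and parameters are defined by the deployed service.

```python
import json
import urllib.request

# Hypothetical base URL; replace with the address of the deployed service.
BASE_URL = "http://audiomining.example.org/api"

# The service handles German and English content.
SUPPORTED_LANGUAGES = ("de", "en")


def build_transcription_request(media_url: str, language: str) -> urllib.request.Request:
    """Build a POST request asking the service to transcribe one media file.

    The path and JSON field names are assumptions made for this sketch.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    payload = json.dumps({"mediaUrl": media_url, "language": language}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/jobs",
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
```

The request object can be passed to urllib.request.urlopen once the real endpoint is known. Keeping request construction separate from sending makes the integration easy to test without a running service.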