Media Data Types
This page describes the object types defined and used by the Media activities.
Transcript
Properties
| Property | Type | Description |
|---|---|---|
| Language | String | The language of the audio. |
| Text | String | The text extracted from the audio. |
| Segments | List<Segment> | The segments extracted during audio transcription. |
Segment
Properties
| Property | Type | Description |
|---|---|---|
| Id | Int32 | The unique identifier of the segment. |
| Seek | Int32 | The position in the audio file where the segment begins. |
| Start | Double | The start time of the segment in the audio file, that is, the timestamp at which the segment begins. |
| End | Double | The end time of the segment in the audio file, that is, the timestamp at which the segment ends. |
| Text | String | The transcribed text of the segment: the written form of the content spoken within it. |
| Tokens | List<Int32> | The integer token identifiers produced for the segment during transcription. |
| Temperature | Double | The intensity or significance of the speech content within the segment, which could help prioritize or categorize segments during transcription. |
| AvgLogprob | Double | The average logarithmic probability associated with the transcription of the segment. |
| CompressionRatio | Double | The compression ratio calculated for the segment. |
| NoSpeechProb | Double | The probability of absence of speech within the segment. |
| Confidence | Double | The confidence level of the segment's transcription: a numerical value indicating how accurate the transcribed text is likely to be. |
| Words | List<Word> | A list of individual words extracted from the transcribed text. |
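As an illustration of how the Segment properties above might be consumed, the following Python sketch keeps only the segments that probably contain speech and were transcribed with reasonable confidence. The `Segment` dataclass is a hypothetical mirror of a subset of the documented properties, not the type shipped with the activities, and the sketch assumes that `NoSpeechProb` and `Confidence` both range from 0 to 1. The threshold values are arbitrary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    # Hypothetical Python mirror of a subset of the documented Segment properties.
    id: int
    start: float
    end: float
    text: str
    no_speech_prob: float
    confidence: float


def reliable_segments(
    segments: List[Segment],
    max_no_speech: float = 0.5,   # assumed threshold, not a documented default
    min_confidence: float = 0.6,  # assumed threshold, not a documented default
) -> List[Segment]:
    """Keep segments that probably contain speech and have a high enough confidence."""
    return [
        s for s in segments
        if s.no_speech_prob <= max_no_speech and s.confidence >= min_confidence
    ]


# Made-up sample data for illustration only.
segments = [
    Segment(0, 0.0, 2.5, "Hello and welcome.", 0.02, 0.93),
    Segment(1, 2.5, 4.0, "", 0.91, 0.20),
    Segment(2, 4.0, 7.0, "Let's get started.", 0.05, 0.88),
]

kept = reliable_segments(segments)
print([s.id for s in kept])
```

With this sample data, the second segment is dropped because its no-speech probability is high and its confidence is low.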
Word
Properties
| Property | Type | Description |
|---|---|---|
| Text | String | The text representation of an individual word within the transcribed segment. |
| Start | Double | The start time of the word within the audio segment. |
| End | Double | The end time of the word within the audio segment. |
| Confidence | Double | The confidence level of the word's transcription: a numerical value indicating how accurate the transcription of that specific word is likely to be. |
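The nesting of the types described so far (a Transcript holds Segments, and each Segment holds Words) can be sketched in Python as follows. The dataclasses are hypothetical mirrors that carry only some of the documented properties, and the `words_to_review` helper and its threshold are assumptions made for illustration; they are not part of the Media activities.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Word:
    text: str
    start: float
    end: float
    confidence: float


@dataclass
class Segment:
    id: int
    text: str
    words: List[Word] = field(default_factory=list)


@dataclass
class Transcript:
    language: str
    text: str
    segments: List[Segment] = field(default_factory=list)


def words_to_review(transcript: Transcript, threshold: float = 0.5) -> List[Tuple[int, Word]]:
    """Return (segment id, word) pairs whose confidence is below the threshold."""
    return [
        (segment.id, word)
        for segment in transcript.segments
        for word in segment.words
        if word.confidence < threshold
    ]


# Made-up sample data for illustration only.
transcript = Transcript(
    language="en",
    text="Hello world",
    segments=[
        Segment(0, "Hello world", [
            Word("Hello", 0.0, 0.4, 0.97),
            Word("world", 0.5, 0.9, 0.31),
        ]),
    ],
)

for segment_id, word in words_to_review(transcript):
    print(f"segment {segment_id}: '{word.text}' ({word.confidence:.2f})")
```

Walking the structure from the top level down keeps each word associated with the segment it belongs to, which is useful when flagging low-confidence words for manual review.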
TranscriptSearchResult
Properties
| Property | Type | Description |
|---|---|---|
| Search Text | String | The text for which the time slots were extracted. |
| Segments | List<FoundSegment> | The segments found for the given text. |
FoundSegment
Properties
| Property | Type | Description |
|---|---|---|
| Id | Int32 | The unique identifier of the segment. |
| Seek | Int32 | The position in the audio file where the segment begins. |
| Start | Double | The start time of the segment in the audio file, that is, the timestamp at which the segment begins. |
| End | Double | The end time of the segment in the audio file, that is, the timestamp at which the segment ends. |
| Text | String | The transcribed text of the segment: the written form of the content spoken within it. |
| Words | List<FoundWord> | A list of individual words extracted from the transcribed text. |
FoundWord
Properties
| Property | Type | Description |
|---|---|---|
| Text | String | The text representation of an individual word within the transcribed segment. |
| Start | Double | The start time of the word within the audio segment. |
| End | Double | The end time of the word within the audio segment. |
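A search result can be read in the same top-down way. The Python sketch below uses hypothetical mirrors of TranscriptSearchResult, FoundSegment, and FoundWord, and a hypothetical `time_slots` helper, to list the start and end time of every found segment. It only consumes a result; how the activities match the search text is not described on this page and is not modeled here. All sample values are made up.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FoundWord:
    text: str
    start: float
    end: float


@dataclass
class FoundSegment:
    id: int
    seek: int
    start: float
    end: float
    text: str
    words: List[FoundWord] = field(default_factory=list)


@dataclass
class TranscriptSearchResult:
    search_text: str
    segments: List[FoundSegment] = field(default_factory=list)


def time_slots(result: TranscriptSearchResult) -> List[Tuple[float, float]]:
    """Return the (start, end) pair of every found segment, in order."""
    return [(segment.start, segment.end) for segment in result.segments]


# Made-up sample result for illustration only.
result = TranscriptSearchResult(
    search_text="invoice",
    segments=[
        FoundSegment(3, 0, 12.4, 15.0, "Please send the invoice.",
                     [FoundWord("invoice.", 14.1, 14.8)]),
        FoundSegment(9, 0, 40.2, 43.7, "The invoice is overdue.",
                     [FoundWord("invoice", 40.6, 41.2)]),
    ],
)

for start, end in time_slots(result):
    print(f"'{result.search_text}' found between {start} and {end}")
```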