hp.telephony.media
Interface ASRRecorderEvent
- All Superinterfaces:
- ASRConstants, CoderConstants, MediaEvent, RecorderConstants, RecorderEvent, ResourceConstants, ResourceEvent, SpeechDetectorConstants
- public interface ASRRecorderEvent
- extends RecorderEvent, ASRConstants
This interface extends the RecorderEvent interface to cover
events specific to RTSP and ASR. The getASRResult() method
gives access to the speech recognition result, including the number of
alternatives, each recognition result alternative, the utterance audio data, the start point
of the utterance, etc.
Currently only rule-based recognition (i.e. with a grammar) is supported, which
always completes the recognition operation when it returns its result.
The result of the recognition is therefore always delivered in an
ev_FinalRuleResult event and retrieved with the getASRResult() method.
It is an ASRResult object, which provides the text or tag as defined
in the relevant grammar.
The ASRResult object contains a set of objects implementing the
ResultAlternative interface. These hold the different possibilities the
recognizer has found, in order of confidence.
Each ResultAlternative contains the words that have been recognized
and the rules used to do so, plus other potentially useful information about the
recognition process.
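As a rough illustration of walking the alternatives described above, the sketch below joins the words of each alternative into one phrase. It does not use the real hp.telephony.media types: Token, Alternative and Result are hypothetical stand-ins, getAlternativeCount() is an assumed name (this page does not say how the number of alternatives is obtained), and fakeResult() exists only to make the sketch runnable.

```java
import java.util.ArrayList;
import java.util.List;

public class AlternativesSketch {
    // Hypothetical stand-ins for ResultToken, ResultAlternative and ASRResult.
    public interface Token { String getFinalizedWord(); }
    public interface Alternative { Token[] getPhrase(); }
    public interface Result {
        int getAlternativeCount();           // assumed name, not taken from this page
        Alternative getAlternative(int index);
    }

    /** Joins the words of every alternative, in the order the result lists them. */
    public static List<String> phrases(Result result) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < result.getAlternativeCount(); i++) {
            StringBuilder sb = new StringBuilder();
            for (Token t : result.getAlternative(i).getPhrase()) {
                if (sb.length() > 0) sb.append(' ');
                sb.append(t.getFinalizedWord());
            }
            out.add(sb.toString());
        }
        return out;
    }

    /** Builds an in-memory result, one alternative per sentence (demo only). */
    public static Result fakeResult(String... sentences) {
        return new Result() {
            public int getAlternativeCount() { return sentences.length; }
            public Alternative getAlternative(int index) {
                String[] words = sentences[index].split(" ");
                return () -> {
                    Token[] tokens = new Token[words.length];
                    for (int k = 0; k < words.length; k++) {
                        String w = words[k];
                        tokens[k] = () -> w;
                    }
                    return tokens;
                };
            }
        };
    }
}
```

Calling phrases(fakeResult("call home", "call Rome")) yields a two-element list with the first alternative's phrase at index 0.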
The ASRResult object also provides methods to retrieve information about the
audio utterance (start, end, URI of the audio data). This information is not always
available; it usually depends on having asked the server to save it during
processing (parameter "Save-Waveform" = "true" in the optArgs).
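A minimal sketch of preparing such optArgs is shown below. It assumes optArgs is a java.util.Dictionary and uses the literal key and value quoted above; a real call may expect the p_SaveWaveform constant from ASRConstants as the key instead. The class and method names are hypothetical.

```java
import java.util.Dictionary;
import java.util.Hashtable;

public class SaveWaveformArgs {
    /** Builds an optArgs dictionary asking the server to keep the utterance audio. */
    public static Dictionary<String, String> withSavedWaveform() {
        Dictionary<String, String> optArgs = new Hashtable<>();
        // Key and value as quoted in the text; the exact key type is an assumption.
        optArgs.put("Save-Waveform", "true");
        return optArgs;
    }
}
```

Even with this parameter set, the audio accessors may still return no data, so callers should be prepared for missing utterance information.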
The words are retrieved from the ResultAlternative via the
getPhrase() method, which returns them as an array of tokens.
Rule information is retrieved via a getParsedRule() call, which returns
an array of ASRParsedRule objects. Each one represents a top level tag in
the result. ASRParsedRule objects contain the ruleId (if available),
the tag name and the contents of the tag as found.
If there are nested tags, then these are available through a getNonTerminalRule()
call to the relevant ASRParsedRule object.
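The traversal of top-level and nested tags could be sketched as a depth-first walk. ParsedRule below is a hypothetical stand-in for ASRParsedRule: only getNonTerminalRule() is named on this page, its array return type is assumed, and getTagName() and getTagContent() are invented accessor names for the tag name and contents.

```java
import java.util.ArrayList;
import java.util.List;

public class ParsedRuleWalk {
    /** Hypothetical stand-in for ASRParsedRule. */
    public interface ParsedRule {
        String getTagName();               // assumed accessor name
        String getTagContent();            // assumed accessor name
        ParsedRule[] getNonTerminalRule(); // nested tags; return type assumed
    }

    /** Lists "name=content" lines depth-first, indented two spaces per nesting level. */
    public static List<String> describe(ParsedRule[] rules) {
        List<String> lines = new ArrayList<>();
        collect(rules, 0, lines);
        return lines;
    }

    private static void collect(ParsedRule[] rules, int depth, List<String> lines) {
        if (rules == null) return; // tolerate rules without nested tags
        for (ParsedRule rule : rules) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < depth; i++) line.append("  ");
            line.append(rule.getTagName()).append('=').append(rule.getTagContent());
            lines.add(line.toString());
            collect(rule.getNonTerminalRule(), depth + 1, lines);
        }
    }

    /** Builds an in-memory rule for demonstration purposes. */
    public static ParsedRule rule(String name, String content, ParsedRule... children) {
        return new ParsedRule() {
            public String getTagName() { return name; }
            public String getTagContent() { return content; }
            public ParsedRule[] getNonTerminalRule() { return children; }
        };
    }
}
```

In real code the array passed to describe() would come from getParsedRule() on a ResultAlternative.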
NOTE: current experience with MRCP / GRXML / NLSML based ASR engines suggests that tag
support is basic, and that the order and nesting of the rules used is not represented in the
tag structures returned. It is therefore recommended that when creating GRXML grammars:
- Tags have unique names within the grammar
- Don't rely on nesting or any encapsulation of rules
- Don't depend on the order of rule execution to be reflected in the order of
ASRParsedRule objects
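One way to follow these recommendations is to index the returned tags by name and never by position. The sketch below assumes each tag has already been reduced to a name/content pair; Tag and byName() are hypothetical helpers, not part of hp.telephony.media.

```java
import java.util.HashMap;
import java.util.Map;

public class TagLookup {
    /** Hypothetical flat view of one returned tag. */
    public static final class Tag {
        final String name;
        final String content;
        public Tag(String name, String content) {
            this.name = name;
            this.content = content;
        }
    }

    /**
     * Indexes tags by name so callers never depend on their order.
     * Rejects duplicate names, since the lookup relies on names being unique.
     */
    public static Map<String, String> byName(Tag[] tags) {
        Map<String, String> index = new HashMap<>();
        for (Tag tag : tags) {
            if (index.containsKey(tag.name)) {
                throw new IllegalArgumentException("duplicate tag name: " + tag.name);
            }
            index.put(tag.name, tag.content);
        }
        return index;
    }
}
```

With this lookup, byName(tags).get("city") gives the same answer whichever order the engine reports the tags in.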
Code sample:
void onRecorderEvent(RecorderEvent anEvent) {
    // Test whether we had a final word result
    if (ASRConstants.ev_FinalWordResult.equals(anEvent.getEventId())) {
        ASRResult result = ((ASRRecorderEvent) anEvent).getASRResult();
        ResultToken[] recognizedWords = result.getAlternative(0).getPhrase();
        System.out.println("recognized words:");
        // Print the recognized words
        for (int i = 0; i < recognizedWords.length; i++)
            System.out.println(recognizedWords[i].getFinalizedWord());
    }
}
// etc.
- Since:
- OCMP 2.2 with ASR/TTS extensions
| Fields inherited from interface hp.telephony.media.RecorderConstants |
a_Beep, a_CoderTypes, a_FixedBeep, a_Pause, a_SilenceTruncation, ev_Pause, ev_Record, ev_Resume, p_Append, p_BeepFrequency, p_BeepLength, p_Coder, p_CoderTypes, p_EnabledEvents, p_FileFormat, p_FinalTimeoutBehaviour, p_MaxDuration, p_MinDuration, p_SignalTruncationOn, p_SilenceEnergyThreshold, p_SilenceTerminationOn, p_SilenceTerminationThreshold, p_SilenceTruncationOn, p_SilenceTruncationThreshold, p_SpeechDetectionMode, p_StartBeep, p_StartPaused, q_Silence, rtca_Pause, rtca_Resume, rtca_Stop, rtcc_RecordComplete, v_DetectAllOccurences, v_DetectFirstOccurence, v_GSMFormat, v_Inactive, v_RawFormat, v_WavFormat |
| Fields inherited from interface hp.telephony.media.CoderConstants |
p_AMR_SDPelement, p_channels, p_crc, p_G723_annexa, p_G723_bitrate, p_G729_annexb, p_interleaving, p_maxptime, p_modechangeneighbor, p_modechangeperiod, p_modeset, p_octetalign, p_ptime, p_robustsorting, v_ADPCM_16kG726, v_ADPCM_24k, v_ADPCM_32k, v_ADPCM_32kG726, v_ADPCM_32kOKI, v_ADPCM_44k, v_ALawPCM_48k, v_ALawPCM_64k, v_ALawPCM_88k, v_AMR, v_AMR_WB, v_G723_1b, v_G723_53, v_G723_63, v_G723_no_vad, v_G723_yes, v_G729_no_vad, v_G729_yes, v_G729a, v_GSM, v_Linear16Bit_64k, v_Linear8Bit_48k, v_Linear8Bit_64k, v_Linear8Bit_88k, v_MuLawPCM_48k, v_MuLawPCM_64k, v_MuLawPCM_88k |
| Fields inherited from interface hp.telephony.media.SpeechDetectorConstants |
p_BargeIn, p_FinalTimeout, p_InitialTimeout, p_InPromptSensitivityPercent, p_Sensitivity, p_Type, q_EndOfSpeechDetected, q_NoSpeechTimeout, q_SpeechDetected, rtca_PromptDone, rtcc_EndOfSpeechDetected, rtcc_NoSpeechTimeout, rtcc_SpeechDetected, v_HP, v_Nvad, v_SwiEp, v_Telisma |
| Fields inherited from interface hp.telephony.media.ASRConstants |
a_ASSessionAllocation, a_ASSessionCreation, a_AudioService, e_BadGrammar, e_ResourceUnavailable, e_ServerError, e_ServerFailed, ev_DefineGrammar, ev_DefineGrammer, ev_finalRuleResult, ev_FinalRuleResult, ev_FinalWordResult, ev_NonFinalRuleResult, ev_NonFinalWordResult, p_Grammar, p_GrammarId, p_GrammarType, p_GrammarURI, p_Language, p_LoadGrammar, p_NBestListLength, p_NoInputTimeout, p_Optimize, p_PauseOnIntermediateResult, p_RecogTimeout, p_SaveWaveform, p_SensitivityLevel, p_ServerURL, p_SpeakerName, p_SpeechCompleteTimeout, p_SpeechIncompleteTimeout, p_StubResultIsNoMatch, p_StubResultXML, p_StubResultXMLFile, p_TimersStartImmediately, q_VoiceDetected, rtca_ReleaseCurrentASSession, rtcc_VoiceDetected, v_ASStartTime, v_Call, v_FromFirstOperation, v_OnDemand, v_Operation |
getASRResult
public ASRResult getASRResult()
- Returns: the ASRResult produced by the speech recognizer