xAI's new Grok speech agent has become the leading speech-to-speech model, outperforming Gemini 2.5 Flash Native Audio and GPT Realtime on our Big Bench Audio benchmark. The model scored 92.3% on Big Bench Audio, slightly ahead of the previous leader, Google's Gemini 2.5 Flash Native Audio Thinking.

This is xAI's first publicly available speech-to-speech API, bringing more competition to the field. The model supports tool calls, and xAI states it is ready for use in voice assistants, phone agents, and interactive voice applications.

Benchmark Background:
Big Bench Audio is the first dataset designed specifically to evaluate the reasoning performance of speech models. It contains 1,000 audio questions adapted from the Big Bench Hard text dataset, which was chosen for its rigorous testing of advanced reasoning and carried over into the audio domain.

Performance:
➤ Reasoning: Scored 92.3% on Big Bench Audio, setting a new high for native speech-to-speech reasoning.
➤ Latency: Average time to first token of 0.78 seconds, ranking third on our leaderboard, behind only Google's Gemini 2.5 Flash Native Audio Dialog and Gemini 2.5 Flash Live.
➤ Pricing: Simple pricing model of 5 cents per minute of connection time, which works out to $3 per hour of audio.

Key Features:
➤ Tool Calls: Use built-in tools such as web search and RAG-based search, or define your own tools using JSON schemas.
➤ Telephony: Connects to Session Initiation Protocol (SIP) providers such as Twilio and Vonage.
➤ Multilingual: Supports over 100 languages and offers 5 voice options.

Congratulations to @xai and @elonmusk on the release of this impressive product!
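To make the "define your own tools using JSON schemas" feature more concrete, here is a minimal Python sketch of what a custom tool definition and an argument check might look like. This post does not describe the request format xAI's API expects, so everything here is an assumption for illustration: the `get_weather` tool, the `name` / `description` / `parameters` layout, and the helper functions `validate_arguments` and `handle_tool_call` are all hypothetical. The validator covers only a small subset of JSON Schema (`required`, `type`, `enum`).

```python
import json

# Hypothetical tool definition. The field names ("name", "description",
# "parameters") are an assumption, not xAI's documented format.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# Mapping from JSON Schema type names to Python types (partial).
_JSON_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_arguments(tool, arguments):
    """Return a list of problems; an empty list means the arguments fit.

    Checks only required fields, basic types, and enum membership.
    """
    schema = tool["parameters"]
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown argument: {name}")
            continue
        expected = _JSON_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{name} should be of type {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems


def handle_tool_call(tool, raw_arguments):
    """Parse a JSON argument string (as a model might send) and validate it."""
    arguments = json.loads(raw_arguments)
    problems = validate_arguments(tool, arguments)
    if problems:
        return {"ok": False, "errors": problems}
    return {"ok": True, "arguments": arguments}


if __name__ == "__main__":
    print(handle_tool_call(GET_WEATHER_TOOL, '{"city": "Paris"}'))
    print(handle_tool_call(GET_WEATHER_TOOL, '{"unit": "kelvin"}'))
```

Validating arguments before running a tool is a common precaution in voice agents, where a misheard or malformed argument is easier to catch and re-ask than to recover from after the tool has already run.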
Risk and Disclaimer: The content shared by the author represents only their personal views and does not reflect the position of CoinWorldNet (币界网). CoinWorldNet does not guarantee the truthfulness, accuracy, or originality of the content. This article does not constitute an offer, solicitation, invitation, recommendation, or advice to buy or sell any investment products or make any investment decisions.