JA STS Arena

🎤 Speech-to-Text Model Comparison

Input Japanese speech by uploading an audio file or recording with your microphone.
Select a target translation language.
The system will process your speech with two different speech-to-text models.
Compare the results side by side and vote for the better translation.
Listen to both translations spoken by Kotoba TTS.

Note: It may take up to 30 seconds to process your speech and synthesize audio.

Record your speech (Japanese)

Or upload an audio file

Target Language

Model A Translation

Model B Translation

🏆 Leaderboard

Vote to help the community determine the best Japanese speech-to-text (STT) models.

The leaderboard displays models in descending order of translation quality (based on votes cast by the community).

Important: To keep results fair, the leaderboard hides a model's results by default until its number of votes passes a threshold. Tick the Reveal preliminary results checkbox to also show models that do not yet have enough votes. Please note that preliminary results may be inaccurate.
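The ranking and vote-threshold behaviour described above can be sketched roughly as follows. This is only an illustration, not the arena's actual implementation: the scoring rule (a plain win rate), the threshold value MIN_VOTES, and the names ModelStats and leaderboard are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical threshold; the real value used by the arena is not stated.
MIN_VOTES = 50


@dataclass
class ModelStats:
    name: str
    wins: int   # comparisons in which this model was voted better
    votes: int  # total comparisons in which this model appeared

    @property
    def win_rate(self) -> float:
        # Assumed scoring rule: fraction of comparisons won.
        return self.wins / self.votes if self.votes else 0.0


def leaderboard(stats, reveal_preliminary=False, min_votes=MIN_VOTES):
    """Return models sorted by win rate, best first.

    Models below the vote threshold are hidden unless
    reveal_preliminary is set.
    """
    visible = [s for s in stats if reveal_preliminary or s.votes >= min_votes]
    return sorted(visible, key=lambda s: s.win_rate, reverse=True)


# Made-up example data.
models = [
    ModelStats("model-x", wins=70, votes=100),
    ModelStats("model-y", wins=3, votes=4),
    ModelStats("model-z", wins=40, votes=100),
]
default_view = [m.name for m in leaderboard(models)]
full_view = [m.name for m in leaderboard(models, reveal_preliminary=True)]
```

In this made-up data, model-y has only four votes, so it is hidden by default. With preliminary results revealed it jumps to the top on a 3-of-4 win rate, which shows why rankings built on very few votes can be misleading.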

Show all models, including models with very few human ratings.

Reveal preliminary results

📄 About

The Japanese STS Translation Arena allows users to compare different speech-to-text models for Japanese. It is inspired by TTS Arena.

Features

This application allows users to:

Input Japanese speech through microphone recording or file upload
Compare two different speech recognition models side by side
Select a target language for translation
Vote for the better translation
Listen to the translated text spoken by Kotoba TTS
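The side-by-side comparison and voting flow listed above could be modelled along these lines. This is a minimal sketch under stated assumptions: the model identifiers, the random pairing, the UUID session ID, and the function names new_battle and record_vote are hypothetical and are not taken from the arena's code.

```python
import random
import uuid

# Hypothetical model identifiers, for illustration only.
MODELS = ["stt-model-1", "stt-model-2", "stt-model-3"]


def new_battle(models, rng=random):
    """Pick two distinct models and assign them to the anonymous slots A and B."""
    model_a, model_b = rng.sample(models, 2)
    # Assumption: a random UUID serves as the unique session ID.
    return {"session_id": str(uuid.uuid4()), "A": model_a, "B": model_b}


def record_vote(battle, choice, tally):
    """Count one appearance for both models and one win for the chosen slot."""
    if choice not in ("A", "B"):
        raise ValueError("choice must be 'A' or 'B'")
    for slot in ("A", "B"):
        entry = tally.setdefault(battle[slot], {"wins": 0, "votes": 0})
        entry["votes"] += 1
    tally[battle[choice]]["wins"] += 1
    return tally


tally = {}
battle = new_battle(MODELS, random.Random(0))  # seeded for repeatability
record_vote(battle, "A", tally)
```

Keeping the model names behind the labels A and B until after the vote is a common arena design choice, since it stops voters from favouring a model they already know.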

Credits

Thank you to TTS Arena, whose open-source code helped make this project possible.

Request a model

Please create a Discussion to request a model.

Privacy statement

We may store speech you input, recognized text, and generated audio. We store a unique ID for each session. You agree that we may collect, share, and/or publish any data you input for research and/or commercial purposes.

License

Generated audio clips may not be redistributed and are for personal, non-commercial use only.
