Use Google Text To Speech with Kotlin #

This is a small sample that calls the Google Cloud Text-to-Speech API from Kotlin.

Setup #

I used IntelliJ IDEA but you can choose your own IDE.
I used Java 17.
I chose Kotlin as the language and Gradle as the build tool.
I started with a blank Kotlin project in IntelliJ and added the google-cloud-texttospeech dependency.
I also had to set up a Google Application Credentials JSON file. Please use the following guide: https://developers.google.com/workspace/guides/create-credentials
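
Before calling the API, it can help to confirm that the credentials file is actually where the GOOGLE_APPLICATION_CREDENTIALS environment variable says it is. The following is only a sketch using the Kotlin standard library; the function name credentialsProblem is made up for illustration and is not part of the Google client library.

```kotlin
import java.io.File

// Hypothetical helper (not part of the Google client library): returns a short
// description of what is wrong with the credentials path, or null if it looks usable.
fun credentialsProblem(path: String?): String? {
    if (path == null || path.isBlank()) {
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    }
    val file = File(path)
    return when {
        !file.isFile -> "Credentials file not found: $path"
        !file.canRead() -> "Credentials file is not readable: $path"
        else -> null
    }
}
```

You could call it as credentialsProblem(System.getenv("GOOGLE_APPLICATION_CREDENTIALS")) at the start of main and print the result if it is not null. It only checks that the file exists and is readable, not that the key inside is valid.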

Gradle File #

Here is the Gradle file for the project:

plugins {
    kotlin("jvm") version "1.8.0"
    application
}

java {
    toolchain {
        languageVersion.set(JavaLanguageVersion.of(17))
    }
}

group = "in.developp"
version = "1.0-SNAPSHOT"

repositories {
    mavenCentral()
}

dependencies {
    testImplementation(kotlin("test"))

    // https://mvnrepository.com/artifact/com.google.cloud/google-cloud-texttospeech
    implementation("com.google.cloud:google-cloud-texttospeech:2.15.0")
}

tasks.test {
    useJUnitPlatform()
}

kotlin {
    jvmToolchain(17) // match the Java 17 toolchain configured above
}

tasks.withType(JavaExec::class) {
    environment("GOOGLE_APPLICATION_CREDENTIALS", "<FULLY QUALIFIED PATH TO YOUR CREDENTIALS JSON FILE>")
}

application {
    mainClass.set("MainKt")
}

Replace the placeholder value of GOOGLE_APPLICATION_CREDENTIALS with the absolute path to your credentials JSON file.
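
If you would rather not hard-code the path in the build file, one possible alternative is to pass it as a Gradle project property. This is a sketch, not something the original project does, and the property name googleCredentials is invented for this example. It would take the place of the tasks.withType(JavaExec::class) block above.

```kotlin
// build.gradle.kts (sketch). "googleCredentials" is a made-up property name.
tasks.withType(JavaExec::class) {
    // Read the path from -PgoogleCredentials=... if it was supplied.
    val credentials = project.findProperty("googleCredentials") as String?
    if (credentials != null) {
        environment("GOOGLE_APPLICATION_CREDENTIALS", credentials)
    }
}
```

With this in place, the path is supplied by adding -PgoogleCredentials=<FULLY QUALIFIED PATH TO YOUR CREDENTIALS JSON FILE> to the gradlew command. If the property is missing, the task falls back to whatever GOOGLE_APPLICATION_CREDENTIALS is already set to in your shell.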

Finally, the code #

The code is basic, and you can add your own exception handling or extend it as needed.

import com.google.cloud.texttospeech.v1.*
import com.google.cloud.texttospeech.v1.AudioEncoding.*
import com.google.protobuf.ByteString
import java.io.File
import java.io.FileOutputStream


// Usage: <input text file> <output MP3 file>
fun main(args: Array<String>) {
    val inputPath = args[0]
    val outputPath = args[1]
    val audioConfig: AudioConfig = AudioConfig.newBuilder()
        .setAudioEncoding(MP3)
        .build()

    // Create the client once and reuse it for every line; use {} closes it at the end.
    TextToSpeechClient.create().use { textToSpeechClient ->
        FileOutputStream(outputPath).use { out ->
            // useLines closes the input file; blank lines carry no text, so skip them.
            File(inputPath).useLines { lines ->
                lines.filter { it.isNotBlank() }.forEach { line: String ->
                    out.write(synthesizeText(textToSpeechClient, line, audioConfig).toByteArray())
                }
            }
        }
    }
    println("Audio content written to file $outputPath")
}

// Sends one piece of text to the API and returns the encoded audio bytes.
fun synthesizeText(textToSpeechClient: TextToSpeechClient, text: String, audioConfig: AudioConfig): ByteString {
    val input: SynthesisInput = SynthesisInput.newBuilder().setText(text).build()
    val voice: VoiceSelectionParams = VoiceSelectionParams.newBuilder()
        .setLanguageCode("en-US")
        .setName("en-US-Neural2-I")
        .setSsmlGender(SsmlVoiceGender.MALE)
        .build()

    val response: SynthesizeSpeechResponse = textToSpeechClient.synthesizeSpeech(input, voice, audioConfig)

    return response.audioContent
}
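
The main function above reads args[0] and args[1] directly, so running it with a missing argument fails with an ArrayIndexOutOfBoundsException. As one example of the exception handling you might add, here is a small sketch that validates the arguments first. The helper name parseArgs is hypothetical and uses only the Kotlin standard library.

```kotlin
import java.io.File

// Hypothetical helper (not part of the original sample): validates the two
// command-line arguments before any API call is made.
// Throws IllegalArgumentException with a readable message on bad input.
fun parseArgs(args: Array<String>): Pair<File, File> {
    require(args.size == 2) { "Usage: <input text file> <output audio file>" }
    val input = File(args[0])
    require(input.isFile && input.canRead()) { "Cannot read input file: ${input.path}" }
    return input to File(args[1])
}
```

In main you could then write val (input, output) = parseArgs(args) and use input.path and output.path in place of the raw arguments.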

To run this code with Gradle from the command line, use something like this:

./gradlew run --args "input.txt output/voice.mp3"
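
Note that the command writes to output/voice.mp3, and FileOutputStream does not create missing directories, so the run fails with a FileNotFoundException if the output folder does not exist yet. You can create the folder by hand, or open the file through a small helper like the sketch below. The name openOutput is made up for this example.

```kotlin
import java.io.File
import java.io.FileOutputStream

// Hypothetical helper: creates any missing parent directories, then opens
// the file for writing (overwriting it if it already exists).
fun openOutput(path: String): FileOutputStream {
    val target = File(path).absoluteFile
    target.parentFile?.mkdirs()
    return FileOutputStream(target)
}
```

In the sample, this would mean writing openOutput(outputPath).use { out -> ... } where the code currently constructs FileOutputStream(outputPath).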