Text to speech #6

Merged 22 commits on Feb 17, 2023
58 changes: 29 additions & 29 deletions README.md
@@ -1,19 +1,21 @@
# Intelligent Java
[![Maven Central](https://img.shields.io/maven-central/v/io.github.barqawiz/intellijava.core?style=for-the-badge)](https://central.sonatype.com/artifact/io.github.barqawiz/intellijava.core/0.6.2)
![GitHub](https://img.shields.io/github/license/Barqawiz/IntelliJava?style=for-the-badge)
[![Maven Central](https://img.shields.io/maven-central/v/io.github.barqawiz/intellijava.core?style=for-the-badge)](https://central.sonatype.com/artifact/io.github.barqawiz/intellijava.core/0.7.0)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/Barqawiz/IntelliJava?style=for-the-badge)](https://github.com/Barqawiz/IntelliJava/releases)
[![GitHub](https://img.shields.io/github/license/Barqawiz/IntelliJava?style=for-the-badge)](https://opensource.org/licenses/Apache-2.0)


Intelligent java (IntelliJava) is the ultimate tool for Java developers looking to integrate with the latest language models and deep learning frameworks. The library provides a simple and intuitive API with convenient methods for sending text input to models like GPT-3 and DALL·E, and receiving generated text or images in return. With just a few lines of code, you can easily access the power of cutting-edge AI models to enhance your projects.
Intelligent java (IntelliJava) is the ultimate tool for Java developers looking to integrate with the latest language models and deep learning frameworks. The library provides a simple and intuitive API with convenient methods for sending text input to models like GPT-3 and DALL·E, and receiving generated text, speech or images in return. With just a few lines of code, you can easily access the power of cutting-edge AI models to enhance your projects.

The supported models:
- **OpenAI**: Access GPT-3 to generate text and DALL·E to generate images. OpenAI is preferred when you want quality results without tuning.
- **Cohere.ai**: Generate text; Cohere allows you to customize your language model to suit your specific needs.
- **Google AI**: Generate audio from text; Access DeepMind’s speech models.

# How to use

1. Import the core jar file OR maven dependency (check the Integration section).
2. Add Gson dependency if using the jar file; otherwise, it's handled by maven or Gradle.
3. Call the ``RemoteLanguageModel`` for the language models and ``RemoteImageModel`` for image generation.
3. Call the ``RemoteLanguageModel`` for the language models, ``RemoteImageModel`` for image generation and ``RemoteSpeechModel`` for text to speech models.

## Integration
The package released to Maven Central Repository:
@@ -23,25 +25,25 @@ Maven:
<dependency>
<groupId>io.github.barqawiz</groupId>
<artifactId>intellijava.core</artifactId>
<version>0.6.2</version>
<version>0.7.0</version>
</dependency>
```

Gradle:

```
implementation 'io.github.barqawiz:intellijava.core:0.6.2'
implementation 'io.github.barqawiz:intellijava.core:0.7.0'
```

Gradle(Kotlin):
```
implementation("io.github.barqawiz:intellijava.core:0.6.2")
implementation("io.github.barqawiz:intellijava.core:0.7.0")
```

Jar download:
[intellijava.jar](https://repo1.maven.org/maven2/io/github/barqawiz/intellijava.core/0.6.2/intellijava.core-0.6.2.jar).
[intellijava.jar](https://repo1.maven.org/maven2/io/github/barqawiz/intellijava.core/0.7.0/intellijava.core-0.7.0.jar).

For ready integration: try the [sample_code](https://github.com/Barqawiz/IntelliJava/tree/main/sample_code).
For ready integration: [try the sample_code](https://github.com/Barqawiz/IntelliJava/tree/main/sample_code).

## Code Example
**Language model code** (2 steps):
@@ -69,31 +71,30 @@ List<String> images = imageModel.generateImages(imageInput);
```
Output:<br>
<img src="images/response_image.png" height="220px">
<br><br>
**Text to speech code** (2 steps):
```java
// 1- initiate the remote speech model
RemoteSpeechModel model = new RemoteSpeechModel(apiKey, SpeechModels.google);

// 2- call generateEnglishText with any text
SpeechInput input = new SpeechInput.Builder("Hi, I am Intelligent Java.").build();
byte[] decodedAudio = model.generateEnglishText(input);
```
Output:<br>
```Java
// save temporary audio file for testing
AudioHelper.saveTempAudio(decodedAudio);
```

For the full example, check the code inside the sample_code project.
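
The byte array returned by `generateEnglishText` can also be written to a file of your choice with standard Java I/O. The following is a minimal sketch, assuming the bytes are MP3 data (the Javadoc in `RemoteSpeechModel` suggests an `.mp3` path). `AudioFileSketch` and `saveAudio` are hypothetical names used for illustration and are not part of IntelliJava; the placeholder bytes stand in for a real response.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of IntelliJava: writes audio bytes to a file.
class AudioFileSketch {

    // Writes the bytes to the target path, creating the file or replacing its content.
    static Path saveAudio(byte[] audio, Path target) {
        try {
            return Files.write(target, audio);
        } catch (IOException e) {
            // Rethrow unchecked so callers are not forced to declare IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder bytes; in real use this is the array returned by generateEnglishText.
        byte[] decodedAudio = {1, 2, 3};
        Path saved = saveAudio(decodedAudio, Paths.get("audio.mp3"));
        System.out.println("Saved " + decodedAudio.length + " bytes to " + saved.toAbsolutePath());
    }
}
```

Unlike a temporary file, a file written this way stays at the path you pass in, which may be more convenient when the audio needs to be kept.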

## Third-party dependencies
The only dependency is **GSON**.
*Required to add manually when using IntelliJava jar. However, if you imported this repo through Maven, it will handle the dependencies.*

For Maven:
```
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.9</version>
</dependency>
```

For Gradle:
```
dependencies {
implementation 'com.google.code.gson:gson:2.8.9'
}
```

For jar download:
[gson download repo](https://search.maven.org/artifact/com.google.code.gson/gson/2.8.9/jar)
[gson download repo](https://search.maven.org/artifact/com.google.code.gson/gson/2.10.1/jar)

## Documentation
[Go to Java docs](https://barqawiz.github.io/IntelliJava/javadocs/)
@@ -105,12 +106,11 @@ Call for contributors:
- [ ] Add support to other OpenAI functions.
- [x] Add support to cohere generate API.
- [ ] Add support to Google language models.
- [x] Add support to Google speech models.
- [ ] Add support to Amazon language models.
- [ ] Add support to Azure models.
- [ ] Add support to Azure nlp models.
- [ ] Add support to Midjourney image generation.
- [ ] Add support to WuDao 2.0 model.
- [ ] Add support to an audio model.


# License
Apache License
4 changes: 2 additions & 2 deletions core/com.intellijava.core/pom.xml
@@ -6,7 +6,7 @@

<groupId>io.github.barqawiz</groupId>
<artifactId>intellijava.core</artifactId>
<version>0.6.3</version>
<version>0.7.0</version>

<name>Intellijava</name>
<description>IntelliJava allows java developers to easily integrate with the latest language models, image generation, and deep learning frameworks.</description>
@@ -66,7 +66,7 @@
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.9</version>
<version>2.10.1</version>
</dependency>
</dependencies>

@@ -156,7 +156,7 @@ public String generateText(LanguageModelInput langInput) throws IOException {
langInput.getPrompt(), langInput.getTemperature(),
langInput.getMaxTokens(), langInput.getNumberOfOutputs()).get(0);
} else {
throw new IllegalArgumentException("This version support openai keyType only");
throw new IllegalArgumentException("The keyType is not supported");
}

}
@@ -185,7 +185,7 @@ public List<String> generateMultiText(LanguageModelInput langInput) throws IOExc
langInput.getPrompt(), langInput.getTemperature(),
langInput.getMaxTokens(), langInput.getNumberOfOutputs());
} else {
throw new IllegalArgumentException("This version support openai keyType only");
throw new IllegalArgumentException("The keyType is not supported");
}

}
@@ -0,0 +1,159 @@
/**
* Copyright 2023 Github.com/Barqawiz/IntelliJava
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.intellijava.core.controller;

import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import com.intellijava.core.model.AudioResponse;
import com.intellijava.core.model.SpeechModels;
import com.intellijava.core.model.input.SpeechInput;
import com.intellijava.core.model.input.SpeechInput.Gender;
import com.intellijava.core.utils.AudioHelper;
import com.intellijava.core.wrappers.GoogleAIWrapper;

/**
* RemoteSpeechModel class provides a remote speech model implementation.
* It generates speech from text using the Wrapper classes.
*
* This version supports Google speech models only.
*
* To use Google speech services:
* 1- Go to console.cloud.google.com.
* 2- Enable "Cloud Text-to-Speech API".
* 3- Generate API key from "Credentials" page.
*
* @author github.com/Barqawiz
*/
public class RemoteSpeechModel {

private SpeechModels keyType;
private GoogleAIWrapper wrapper;

/**
*
* Constructs a new RemoteSpeechModel object with the specified key value and key type string.
* If keyTypeString is empty, it is set to "google" by default.
*
* @param keyValue the API key value to use.
* @param keyTypeString the string representation of the key type.
*/
public RemoteSpeechModel(String keyValue, String keyTypeString) {

if (keyTypeString.isEmpty()) {
keyTypeString = SpeechModels.google.toString();
}

List<String> supportedModels = this.getSupportedModels();


if (supportedModels.contains(keyTypeString)) {
this.initiate(keyValue, SpeechModels.valueOf(keyTypeString));
} else {
String models = String.join(" - ", supportedModels);
throw new IllegalArgumentException("The received keyTypeString is not supported. Send any model from: " + models);
}
}

/**
*
* Constructs a new RemoteSpeechModel object with the specified key value and key type.
*
* @param keyValue The API key value to use.
* @param keyType The SpeechModels enum value representing the key type.
*/
public RemoteSpeechModel(String keyValue, SpeechModels keyType) {
this.initiate(keyValue, keyType);
}

/**
* Initiate the object with the specified key value and key type.
*
* @param keyValue the API key value to use.
* @param keyType the SpeechModels enum value representing the key type.
*/
private void initiate(String keyValue, SpeechModels keyType) {

this.keyType = keyType;
wrapper = new GoogleAIWrapper(keyValue);
}

/**
* Get a list of supported key type models.
*
* @return list of the supported SpeechModels enum values.
*/
public List<String> getSupportedModels() {
SpeechModels[] values = SpeechModels.values();
List<String> enumValues = new ArrayList<>();

for (int i = 0; i < values.length; i++) {
enumValues.add(values[i].name());
}

return enumValues;
}

/**
* Generates speech from text using the supported models.
*
* You can save the returned bytes to an audio file using FileOutputStream("path/audio.mp3").
*
* @param input SpeechInput object containing the text and gender to use.
* @return byte array of the decoded audio content.
* @throws IOException in case of communication error.
*/
public byte[] generateEnglishText(SpeechInput input) throws IOException {

if (this.keyType == SpeechModels.google) {
return this.generateGoogleText(input.getText(), input.getGender(), "en-gb");
} else {
throw new IllegalArgumentException("The keyType is not supported");
}
}

/**
* Generates speech from text using the Google Speech service API.
*
* @param text text to generate the speech.
* @param gender gender to use (male or female).
* @param language language code of the voice, e.g. en-gb.
* @return byte array of the decoded audio content.
* @throws IOException in case of communication error.
*/
private byte[] generateGoogleText(String text, Gender gender, String language) throws IOException {
byte[] decodedAudio = null;

Map<String, Object> params = new HashMap<>();
params.put("text", text);
params.put("languageCode", language);

if (gender == Gender.FEMALE) {
params.put("name", "en-GB-Standard-A");
params.put("ssmlGender", "FEMALE");
} else {
params.put("name", "en-GB-Standard-B");
params.put("ssmlGender", "MALE");
}

AudioResponse resModel = (AudioResponse) wrapper.generateSpeech(params);
decodedAudio = AudioHelper.decode(resModel.getAudioContent());

return decodedAudio;
}
}
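
The voice selection in `generateGoogleText` can be read in isolation as a small mapping from gender to request parameters. The sketch below reproduces that branching in a standalone class so it can be run without the library; `VoiceParamsSketch`, its nested `Gender` enum, and `buildParams` are illustrative names, not IntelliJava's API, and the language code is fixed to `en-gb` as in the diff.

```java
import java.util.HashMap;
import java.util.Map;

// Standalone sketch; class, enum, and method names are illustrative only.
class VoiceParamsSketch {

    enum Gender { MALE, FEMALE }

    // Builds the parameter map for a British English voice.
    // Any value other than FEMALE (including null) selects the male voice,
    // mirroring the if/else in the diff above.
    static Map<String, Object> buildParams(String text, Gender gender) {
        Map<String, Object> params = new HashMap<>();
        params.put("text", text);
        params.put("languageCode", "en-gb");

        boolean female = gender == Gender.FEMALE;
        params.put("name", female ? "en-GB-Standard-A" : "en-GB-Standard-B");
        params.put("ssmlGender", female ? "FEMALE" : "MALE");
        return params;
    }

    public static void main(String[] args) {
        System.out.println(buildParams("Hi, I am Intelligent Java.", Gender.FEMALE));
    }
}
```

One consequence of this shape is that an unset gender silently falls back to the male voice rather than raising an error.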
@@ -0,0 +1,57 @@
/**
* Copyright 2023 Github.com/Barqawiz/IntelliJava
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.intellijava.core.model;

import com.google.gson.annotations.SerializedName;

/**
*
* AudioResponse represents the response from the speech API that contains the audio content.
*
* @author github.com/Barqawiz
*
*/
public class AudioResponse extends BaseRemoteModel {

/**
* Default AudioResponse constructor.
*/
public AudioResponse() {}

/**
* The audio content generated from a text.
*/
@SerializedName("audioContent")
private String audioContent;

/**
* Gets the audio content generated from a text.
* @return audio content as a base64 string.
*/
public String getAudioContent() {
return audioContent;
}

/**
* Sets the audio content generated from a text.
*
* @param audioContent audio content as a base64 string.
*/
public void setAudioContent(String audioContent) {
this.audioContent = audioContent;
}

}
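
Because `audioContent` is documented as a base64 string, turning it into playable bytes is a decoding step. `AudioHelper.decode` is not shown in this diff, so the sketch below only assumes it behaves like standard Base64 decoding; `AudioDecodeSketch` is a hypothetical stand-in built on `java.util.Base64`, and the sample string is made up for the demonstration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical stand-in for the decoding step; not the library's AudioHelper.
class AudioDecodeSketch {

    // Decodes a standard (RFC 4648) base64 string into raw bytes.
    static byte[] decode(String audioContent) {
        return Base64.getDecoder().decode(audioContent);
    }

    public static void main(String[] args) {
        // Encode a short sample so the example runs without calling any API.
        String sample = Base64.getEncoder()
                .encodeToString("Hi".getBytes(StandardCharsets.UTF_8));
        byte[] decoded = decode(sample);
        System.out.println(decoded.length + " bytes decoded");
    }
}
```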
@@ -0,0 +1,11 @@
package com.intellijava.core.model;

/**
* Supported speech models.
*
* @author github.com/Barqawiz
*
*/
public enum SpeechModels {
/** google model */google
}
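
The string-based `RemoteSpeechModel` constructor resolves a model name against this enum, defaulting to `google` when the name is empty. A self-contained sketch of that lookup follows; `ModelNameSketch` and `resolve` are hypothetical names, the nested enum is a local copy so the example compiles on its own, and the null check is an addition that the constructor in the diff does not have.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of name-to-enum resolution; not the library's code.
class ModelNameSketch {

    // Local copy of the enum so the example is self-contained.
    enum SpeechModels { google }

    // Lists the enum constant names, as getSupportedModels does in the diff.
    static List<String> supportedNames() {
        List<String> names = new ArrayList<>();
        for (SpeechModels model : SpeechModels.values()) {
            names.add(model.name());
        }
        return names;
    }

    // Resolves a name to an enum value; null or empty input defaults to google.
    static SpeechModels resolve(String name) {
        if (name == null || name.isEmpty()) {
            return SpeechModels.google;
        }
        List<String> supported = supportedNames();
        if (supported.contains(name)) {
            return SpeechModels.valueOf(name);
        }
        throw new IllegalArgumentException(
                "Unsupported model: " + name + ". Supported: " + String.join(" - ", supported));
    }

    public static void main(String[] args) {
        System.out.println(resolve(""));
        System.out.println(resolve("google"));
    }
}
```

The match is case-sensitive because it compares against the enum constant names, so a value such as `Google` would be rejected.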