ChatGPT Clone (Multi-Model Support / Cloud Storage / Custom GPTs / TTS / Image Gen)

Overview

This project is a feature-rich, browser-based application designed to interact with multiple AI language models. It provides a user interface inspired by ChatGPT, enabling conversations with various models including:

OpenAI models: gpt-4o (using the /v1/responses API) and o3-mini (using the /v1/chat/completions API)
Google Gemini models: Support for Gemini models via the Google AI API
xAI models: Support for Grok models via the X.AI API

The application now features cloud storage with Supabase, allowing users to create accounts and access their data across devices. When logged in, all chat history, API settings, and Custom GPT configurations are stored in the Supabase database; localStorage serves as a fallback when offline or not logged in.

Key features include:

User Authentication: Create accounts, log in, and securely access your data from any device using Supabase authentication.
Custom GPT functionality: Create, edit, and manage personalized GPTs with specific instructions, capabilities (like web search), and knowledge bases (uploaded TXT, MD, PDF files).
Text-to-Speech (TTS): Playback AI responses using OpenAI's gpt-4o-mini-tts model, with customizable instructions (tone, speed) via the settings panel.
Image Generation: Generate images using DALL-E 3 based on user prompts when gpt-4o is the effective model.
Multi-modal Input: Supports text, image uploads (JPG/PNG for gpt-4o), and per-message file attachments (TXT/MD/PDF).

Note: Earlier versions stored data only in localStorage. The application now offers cloud storage through the Supabase integration. When logged in, API keys are stored in the Supabase database rather than in localStorage, which improves on browser-only storage but still has caveats (see Limitations & Known Issues).

Key Features

Multi-Model Support:
- OpenAI Models: Routes requests to OpenAI's /v1/responses API for gpt-4o or /v1/chat/completions API for o3-mini.
- Google Gemini Models: Supports Gemini models through the Google AI API.
- xAI Models: Supports Grok models through the X.AI API.
- Handles streaming responses from all supported API endpoints.
- Selects the appropriate API automatically based on the selected model.
User Authentication & Cloud Storage:
- User Accounts: Create accounts, sign in, and access your data from any device.
- Secure API Key Storage: API keys are stored in the Supabase database when logged in.
- Cross-Device Access: Access your chats, custom GPTs, and settings from anywhere.
- Offline Fallback: Falls back to localStorage when offline or not logged in.
Input Methods:
- Standard text input with automatic resizing.
- Image upload (JPG/PNG) for gpt-4o prompts (via image icon).
- Per-message file attachments (TXT, MD, PDF) for context injection (via paperclip icon).
Output & Interaction:
- Streaming AI responses displayed incrementally.
- Markdown rendering for formatted AI responses (using marked.js).
- Copy-to-clipboard action for AI message text (or image URL).
- Regenerate response action for AI messages.
- Text-to-Speech (TTS): Playback AI responses using gpt-4o-mini-tts via a speaker icon on each message. TTS instructions (e.g., tone, speed) can be configured in General Settings. Manages audio playback state (loading, playing, stopping).
- Image Generation Display: Renders images generated by DALL-E 3 directly in the chat, along with any revised prompt used by the model.
Custom GPT Management:
- Create, edit, and delete Custom GPT configurations via a dedicated modal accessed from the sidebar.
- Define: Name, Description, Instructions (System Prompt).
- Enable Capabilities: Web Search toggle (requires gpt-4o).
- Upload Knowledge Files: Attach TXT, MD, or PDF files (content read, validated, and stored in the database when logged in).
- Activate Custom GPTs from the sidebar list to tailor conversations. The active GPT's name is displayed in the header.
Persistence (Supabase & localStorage):
- Cloud Storage: When logged in, all data is stored in the Supabase database.
- Local Fallback: When not logged in, falls back to localStorage.
- Saves/loads individual chat conversations.
- Automatically saves the current chat when switching contexts (new chat, load chat).
- Dynamically lists saved chats in the sidebar for easy access and deletion.
- Persists Custom GPT configurations, including instructions and knowledge file content.
Settings & Configuration:
- General Settings Modal: Configure API Keys for multiple providers (OpenAI, Google Gemini, xAI).
- Set the default model used when no Custom GPT is active.
- Configure custom instructions for TTS playback.
- Settings are synchronized across devices when logged in.
UI Components:
- Responsive layout with collapsible sidebar.
- Main chat interface with user/AI message bubbles.
- Authentication modal for sign up, sign in, and password reset.
- Separate modals for General Settings and Custom GPT creation/editing.
- Toast notifications for user feedback (success, error, info).
- Input toolbar with buttons/toggles for image upload, file attachment, web search, image generation mode. Button states update based on the effective model.
- Header displays the default model selector (dropdown) and the name of the currently active Custom GPT (if any).

Technology Stack

Frontend: HTML5, CSS3, JavaScript (Vanilla ES Modules)
Markdown Rendering: marked.js (via CDN)
PDF Text Extraction: pdf.js (via CDN)
APIs:
- OpenAI API (/v1/chat/completions, /v1/responses, /v1/audio/speech, /v1/images/generations)
- Google AI API (Gemini models)
- X.AI API (Grok models)
Backend & Authentication: Supabase (PostgreSQL database, Auth, Row Level Security)
Storage:
- Primary: Supabase database (when logged in)
- Fallback: Browser localStorage (when offline or not logged in)
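
The primary/fallback arrangement above could look roughly like the following. This is a minimal sketch, not the project's code: `createStorage`, `cloudStore`, `localStore`, and `createMemoryStore` are hypothetical names, and the in-memory store only stands in for Supabase or localStorage so the example runs anywhere.

```javascript
// Hypothetical sketch: prefer the cloud store when logged in and online,
// otherwise (or if the cloud write fails) fall back to a local store.
// `cloudStore` and `localStore` are assumed adapters, not the project's real modules.
function createStorage({ cloudStore, localStore, isLoggedIn, isOnline }) {
  return {
    async save(key, value) {
      if (isLoggedIn() && isOnline()) {
        try {
          await cloudStore.save(key, value);
          return 'cloud';
        } catch (err) {
          // Cloud write failed (for example a network error): use the local store instead.
        }
      }
      await localStore.save(key, value);
      return 'local';
    },
  };
}

// In-memory stand-in used only for demonstration.
function createMemoryStore() {
  const data = new Map();
  return {
    save(key, value) { data.set(key, JSON.stringify(value)); },
    load(key) { return data.has(key) ? JSON.parse(data.get(key)) : null; },
  };
}
```

Returning the name of the backend that accepted the write makes it easy for a caller to show a toast such as "saved locally" when the fallback was used.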

File Structure

.
│
├── index.html          # Main HTML file, structure of the page
│
├── css/
│   ├── base.css        # Base styles, resets, CSS variables
│   ├── components.css  # Styles for individual UI components (buttons, messages, modals, etc.)
│   └── layout.css      # Styles for page structure (header, sidebar, chat area, input area)
│
├── js/
│   ├── main.js         # Main application entry point, initialization logic
│   ├── api.js          # Handles API calls (Chat, Responses, TTS, Image Gen) & routing logic
│   ├── state.js        # Manages active session state (settings, history, active GPT, toggles, etc.)
│   ├── chatStore.js    # Manages persistent CHAT storage (localStorage)
│   ├── parser.js       # Parses Markdown using 'marked', handles streaming text accumulation
│   ├── utils.js        # Utility functions (escapeHTML, copy, base64, file reading, PDF processing, ID generation)
│   │
│   ├── components/     # Modules for specific UI parts
│   │   ├── header.js       # Header logic (default model dropdown, active GPT display, settings button)
│   │   ├── chatInput.js    # Input area, image/file upload, toolbar logic (search, image gen toggles)
│   │   ├── messageList.js  # Message rendering, streaming updates, typing indicator, TTS button logic, history rendering
│   │   ├── notification.js # Displaying temporary toast notifications
│   │   ├── settingsModal.js # General settings modal logic (API key, default model, TTS instructions)
│   │   ├── sidebar.js      # Sidebar logic (visibility, chat & GPT lists, loading/deleting chats/GPTs, new chat)
│   │   └── welcomeScreen.js # Initial welcome screen logic, example prompts
│   │
│   └── customGpt/      # Modules for Custom GPT functionality
│       ├── gptStore.js       # Manages persistent CUSTOM GPT CONFIG storage (localStorage), handles size limits
│       ├── knowledgeHandler.js # Handles processing/validation of knowledge files for the creator modal
│       └── creatorScreen.js  # Logic for the Custom GPT creator/editor modal UI
│
└── README.md           # This file

Core Functionality Breakdown

API Routing (api.js): Determines whether to call Chat Completions (o3-mini), Responses (gpt-4o), Image Generations (DALL-E 3), or TTS (gpt-4o-mini-tts) API based on the effective model and user actions. If a Custom GPT is active, its instructions, knowledge content, and capability settings are retrieved from state.js and injected into the appropriate API request payload. Handles streaming for chat/responses.
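
The routing decision described above can be sketched as a small pure function. The endpoint paths and model names come from this README; `selectEndpoint`, its parameter names, and the `ENDPOINTS` table are assumptions made for illustration, and Gemini/Grok routing is left out because their endpoints are not listed here.

```javascript
// Hypothetical routing table; paths are taken from the README,
// everything else (names, argument shape) is assumed for this sketch.
const ENDPOINTS = {
  chat: '/v1/chat/completions',
  responses: '/v1/responses',
  image: '/v1/images/generations',
  speech: '/v1/audio/speech',
};

function selectEndpoint({ model, imageGenerationMode = false, action = 'message' }) {
  // TTS playback always goes to the speech endpoint (gpt-4o-mini-tts).
  if (action === 'speak') return ENDPOINTS.speech;
  // Image generation (DALL-E 3) is only offered when gpt-4o is the effective model.
  if (imageGenerationMode && model === 'gpt-4o') return ENDPOINTS.image;
  if (model === 'gpt-4o') return ENDPOINTS.responses;
  if (model === 'o3-mini') return ENDPOINTS.chat;
  throw new Error(`No OpenAI route known for model: ${model}`);
}
```

Keeping the decision separate from the network call makes it straightforward to unit-test the routing rules without an API key.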
State Management (state.js): Holds the current session's data: API key, default model setting, TTS instructions, current chat history, active chat ID (from chatStore), active Custom GPT configuration (loaded from gptStore), staged image/files for the next message, web search toggle state, image generation mode state, last generated image URL, and the previous_response_id for ongoing /v1/responses conversations.
Persistence (chatStore.js, gptStore.js):
- chatStore.js: Handles saving/loading/deleting individual chat histories to/from localStorage. Manages the list of chat metadata.
- gptStore.js: Handles saving/loading/deleting Custom GPT configurations. Stores the entire config, including name, description, instructions, capabilities, and knowledge file content, in localStorage. Includes checks to prevent exceeding typical localStorage size limits (around 5MB). Manages the list of config metadata.
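
A size guard of the kind gptStore.js is described as performing might look like this. It is only a sketch: the function names are hypothetical, the 5 MB figure is the "typical" limit quoted above, and the two-bytes-per-character estimate is an assumption, since real quotas and accounting differ between browsers.

```javascript
// Rough size guard before writing a Custom GPT config to localStorage.
// 5 MB is the typical limit mentioned in the README; actual quotas vary by browser.
const APPROX_LIMIT_BYTES = 5 * 1024 * 1024;

function estimateBytes(value) {
  // Assumption: strings are stored as UTF-16, so count about 2 bytes per character.
  return JSON.stringify(value).length * 2;
}

function fitsInLocalStorage(config, usedBytes = 0, limit = APPROX_LIMIT_BYTES) {
  return usedBytes + estimateBytes(config) <= limit;
}
```

A caller could run `fitsInLocalStorage(config)` before saving and show an error toast instead of attempting a write that the browser would reject.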
Chat History (chatStore.js, sidebar.js, messageList.js): Regular chats are saved automatically when switching contexts (new chat, load chat). The sidebar lists saved chats, allowing users to load or delete them. messageList.js renders the history from state.js upon loading.
Custom GPTs (customGpt/ modules, sidebar.js, state.js, api.js, header.js):
- creatorScreen.js: Manages the UI modal for creating/editing configs (name, description, instructions, capabilities, knowledge file list). Uses knowledgeHandler.js for file processing and gptStore.js for saving/updating.
- knowledgeHandler.js: Processes uploaded files (TXT, MD, PDF) within the creator modal, validating type/size, reading content, and returning results to creatorScreen.js.
- gptStore.js: Saves/loads/deletes the complete configuration (including file content) in localStorage.
- sidebar.js: Lists available Custom GPTs from gptStore. Handles activation (loads config into state.js, clears chat), edit (opens creatorScreen.js), and delete actions.
- state.js: Stores the currently active Custom GPT configuration object.
- api.js: Injects the active config's instructions and knowledge content into the prompt sent to the OpenAI API.
- header.js: Updates the header display to show the name of the active Custom GPT.
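
To make the injection step concrete, here is one way a /v1/responses request body could be assembled from an active Custom GPT. This is a sketch under assumptions: the config shape (`instructions`, `knowledgeFiles`, `capabilities.webSearch`) and the function name are hypothetical, and the `web_search_preview` tool type follows OpenAI's Responses API documentation rather than anything confirmed in this repository.

```javascript
// Sketch: build a /v1/responses body, folding Custom GPT instructions and
// knowledge file text into the `instructions` field. Config shape is assumed.
function buildResponsesPayload({ model, userText, gptConfig = null, previousResponseId = null }) {
  const payload = { model, input: userText, stream: true };

  if (gptConfig) {
    const knowledge = (gptConfig.knowledgeFiles || [])
      .map((file) => `--- ${file.name} ---\n${file.content}`)
      .join('\n\n');
    payload.instructions = knowledge
      ? `${gptConfig.instructions}\n\nKnowledge:\n${knowledge}`
      : gptConfig.instructions;
    if (gptConfig.capabilities && gptConfig.capabilities.webSearch) {
      // Tool type as documented by OpenAI; the project may configure this differently.
      payload.tools = [{ type: 'web_search_preview' }];
    }
  }
  // Continue an existing conversation when a previous response id is known.
  if (previousResponseId) payload.previous_response_id = previousResponseId;
  return payload;
}
```

Because knowledge content is sent with every request in this sketch, large knowledge files increase token usage on each turn.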
Text-to-Speech (messageList.js, api.js, state.js):
- A "Listen" button (speaker icon) appears on completed AI text messages.
- messageList.js::handleListenClick: Stops previous audio, sets loading state, retrieves custom TTS instructions from state.js.
- Calls api.js::fetchSpeech, passing the message text and instructions to the /v1/audio/speech endpoint (using gpt-4o-mini-tts).
- Receives an audio Blob, creates a URL, and plays it using the browser's Audio API.
- Manages playback state (loading, playing, error, ended) and cleans up resources (stopCurrentAudio).
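
The TTS flow above can be sketched in two parts: building the request and playing the result. The function names here are hypothetical and the `alloy` voice is an assumption (this README does not say which voice is used); `playSpeech` relies on browser APIs (`fetch`, `Audio`, `URL.createObjectURL`) and is shown only to illustrate the steps.

```javascript
// Build the request for OpenAI's /v1/audio/speech endpoint.
// The default voice is an assumption; instructions are optional.
function buildSpeechRequest({ apiKey, text, instructions = '', voice = 'alloy' }) {
  const body = { model: 'gpt-4o-mini-tts', input: text, voice };
  if (instructions.trim()) body.instructions = instructions.trim();
  return {
    url: 'https://api.openai.com/v1/audio/speech',
    options: {
      method: 'POST',
      headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    },
  };
}

// Browser-only: fetch the audio Blob, play it, and release the object URL afterwards.
async function playSpeech(params) {
  const { url, options } = buildSpeechRequest(params);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);
  const objectUrl = URL.createObjectURL(await response.blob());
  const audio = new Audio(objectUrl);
  audio.addEventListener('ended', () => URL.revokeObjectURL(objectUrl));
  await audio.play();
  return audio;
}
```

Returning the `Audio` element lets the caller pause it and revoke the URL when a new playback request arrives, which mirrors the "stop previous audio" step.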
Image Generation (chatInput.js, api.js, state.js, messageList.js):
- chatInput.js: Toggles image generation mode via button, updates placeholder. On send, checks mode and prompt.
- api.js::fetchImageGeneration: Calls /v1/images/generations (DALL-E 3) with the prompt.
- messageList.js: Renders the generated image and any revised prompt in an AI message bubble. Disables regenerate/listen actions for image messages.
- state.js: Stores the isImageGenerationMode flag and the lastGeneratedImageUrl (used by api.js if the next user message should reference the generated image).
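
The request and response handling for image generation might be sketched as below. The function names and the fixed `1024x1024` size are assumptions for illustration; the `data[0].url` and `data[0].revised_prompt` fields follow OpenAI's documented response for DALL-E 3 with the default URL output.

```javascript
// Build the JSON body for /v1/images/generations (DALL-E 3 accepts n = 1 only).
function buildImageRequestBody(prompt) {
  const trimmed = prompt.trim();
  if (!trimmed) throw new Error('Image generation needs a non-empty prompt.');
  return { model: 'dall-e-3', prompt: trimmed, n: 1, size: '1024x1024' };
}

// Extract what the chat UI needs: the image URL and any revised prompt.
function parseImageResponse(json) {
  const first = json && Array.isArray(json.data) ? json.data[0] : null;
  if (!first || !first.url) throw new Error('No image URL in response.');
  return { imageUrl: first.url, revisedPrompt: first.revised_prompt || null };
}
```

The parsed `imageUrl` is the value that would be kept as the last generated image URL, and `revisedPrompt` is what the message bubble would display beneath the image.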
Input Handling (chatInput.js): Manages the text area, image preview/removal (for user uploads), per-message file attachment/preview/removal, and the state/availability of toolbar buttons (Web Search, Image Generation, Image Upload, File Add) based on the effective model (default or Custom GPT). Calls api.routeApiCall on send.
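
How toolbar availability follows from the effective model can be illustrated with a small lookup. Everything here is an assumption layered on the README's statements: the capability table only encodes that image upload, web search, and image generation need gpt-4o, the `model` field on a Custom GPT config is hypothetical, and file attachments are assumed to be available for every model.

```javascript
// Assumed capability table: the README only states that these features require gpt-4o.
const MODEL_CAPABILITIES = {
  'gpt-4o': { imageUpload: true, webSearch: true, imageGeneration: true },
};
const NO_EXTRAS = { imageUpload: false, webSearch: false, imageGeneration: false };

// Hypothetical: a Custom GPT config may carry its own `model`; otherwise use the default.
function getEffectiveModel(defaultModel, activeGpt = null) {
  return (activeGpt && activeGpt.model) || defaultModel;
}

function getToolbarState(defaultModel, activeGpt = null) {
  const model = getEffectiveModel(defaultModel, activeGpt);
  const caps = MODEL_CAPABILITIES[model] || NO_EXTRAS;
  // File attachments (TXT/MD/PDF) are assumed to work with every model.
  return { model, ...caps, fileAttach: true };
}
```

For example, `getToolbarState('o3-mini')` disables the three gpt-4o features, while `getToolbarState('o3-mini', { model: 'gpt-4o' })` enables them because the active Custom GPT overrides the default model in this sketch.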
Streaming & Parsing (api.js, parser.js, messageList.js):
- api.js: Reads streaming responses from both API types (Chat Completions and Responses).
- parser.js: Accumulates raw text (accumulateChunkAndGetEscaped), returning escaped chunks for immediate display. Provides final parsed HTML (parseFinalHtml) using marked.js.
- messageList.js: Creates AI message container, appends escaped chunks via appendAIMessageContent, finalizes with parsed HTML via finalizeAIMessageContent, and sets up action buttons.
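
The accumulate-then-finalize pattern described above can be sketched as follows. The method name `accumulateChunkAndGetEscaped` is taken from this README, but its real signature is not shown here, so the factory function, `getRawText`, and `reset` are assumptions. In the browser, the final raw text would be handed to marked.js (for example `marked.parse(rawText)`) once the stream ends.

```javascript
// Escape text so partial chunks can be appended to the DOM safely while streaming.
function escapeHTML(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical accumulator: keeps the raw Markdown for the final render and
// returns an escaped copy of each chunk for immediate display.
function createStreamAccumulator() {
  let raw = '';
  return {
    accumulateChunkAndGetEscaped(chunk) {
      raw += chunk;
      return escapeHTML(chunk);
    },
    getRawText() { return raw; },
    reset() { raw = ''; },
  };
}
```

Escaping during streaming and parsing Markdown only once at the end avoids re-rendering half-finished Markdown (such as an unclosed code fence) on every chunk.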

Setup

Clone or download this repository.
Serve the project directory with a local web server, then open the application in a modern web browser.
- Note: Opening index.html directly from disk (file://) does not work, because browser security policies (CORS) block ES Modules and the pdf.js worker when they are loaded from local files. Any simple static web server will do, e.g.:
  - Using Python: run python -m http.server 8000 (or python3 ...) in the project directory.
  - Using Node.js: install serve with npm install -g serve, then run serve . in the project directory.
  - Using the VS Code Live Server extension.
- Access the application via http://localhost:8000 (or the port your server reports).

Usage

Create an Account or Sign In:
- Click the "Log in" button in the sidebar.
- Create a new account or sign in with your existing credentials.
- Verify your email if creating a new account.
Open Settings: Click the "Settings" button in the sidebar footer or the settings icon in the header.
Enter API Keys:
- Paste your OpenAI API key into the designated field.
- (Optional) Add Google Gemini API key to use Gemini models.
- (Optional) Add X.AI API key to use Grok models.
Set Defaults:
- Choose your preferred default model from the available options.
- (Optional) Enter default instructions for Text-to-Speech playback (e.g., "Speak clearly and calmly.").
Save Settings: Your keys and preferences are stored in the Supabase database when logged in, or locally in the browser when not logged in.
Start Chatting: Type messages in the input box and press Enter or click the send button.
Use Features:
- Model Selection: Choose from OpenAI, Gemini, or Grok models using the dropdown in the header.
- Image Upload: Click the image icon (requires compatible models like gpt-4o).
- File Attachment: Click the paperclip icon to attach TXT/MD/PDF files to the next message.
- Web Search: Click the "Search" button to toggle web search for the next message (requires compatible models).
- Image Generation: Click the "Generate Image" button to toggle image generation mode (requires compatible models). Enter a prompt and send.
- TTS: Click the speaker icon on a completed AI text response to hear it read aloud using your configured instructions.
Manage Chats: Use the sidebar ("Chats" section) to start new chats or load/delete previous conversations. Your chats are synchronized across devices when logged in.
Manage Custom GPTs:
- Create: Click the "+" button in the "Custom GPTs" sidebar section to open the creator modal.
- Configure: Define Name, Description, Instructions. Toggle Capabilities (Web Search). Upload Knowledge Files (TXT, MD, PDF).
- Save: Click "Save". The configuration (including file content) is stored in the database when logged in.
- Activate: Click a saved Custom GPT in the sidebar list. The header will update, and the chat context will reset for this GPT.
- Edit/Delete: Hover over a Custom GPT in the list to reveal Edit (pencil) and Delete (trash) buttons. Edit opens the creator modal pre-filled. Delete prompts for confirmation.
Access Across Devices: When logged in, all your data (chats, custom GPTs, settings) is available on any device where you sign in to your account.

Configuration

API Keys:
- OpenAI API Key: Required for OpenAI models. Entered in the General Settings modal.
- Google Gemini API Key: Optional. Required only if you want to use Gemini models.
- X.AI API Key: Optional. Required only if you want to use Grok models.
- All API keys are stored in the Supabase database when logged in, or in localStorage when not logged in.
Default Model: Selected in the General Settings modal or header dropdown. This is the fallback when no Custom GPT is active.
TTS Instructions: Optional. Entered in the General Settings modal. Affects voice characteristics.
Authentication: User accounts are managed through Supabase Authentication. Email verification is required for new accounts.
AI Voice Disclosure: The application includes notes stating that AI voices are generated by OpenAI, as required by their policy.

Limitations & Known Issues

API Key Security: While API keys are now stored in the Supabase database when logged in (improving security), they are still sent directly from the client to the respective AI providers. For production environments, consider implementing a proxy server.
PDF Processing: Relies on the pdf.js library and its worker script loaded from a CDN. Network issues or CDN changes could affect PDF reading. Requires serving files via HTTP(S) due to browser security restrictions with workers. Password-protected PDFs are not supported.
Model Compatibility: Not all features are available with all models. For example, image upload and generation are only available with certain OpenAI models.
Basic Error Handling: While some API errors are caught, complex issues or unexpected API responses might not be handled gracefully.
Supabase Dependency: The application now depends on Supabase for authentication and data storage. If Supabase is unavailable, the application will fall back to localStorage, but some features may be limited.
Features Not Implemented: Some buttons like "Deep Research", "Voice Input", and "Dark Mode" are placeholders and show an "unimplemented" notification.

Future Enhancements (Ideas)

Add support for more AI models and providers.
Enhance the authentication system with social login options (Google, GitHub, etc.).
Implement data encryption for API keys stored in the database.
Add support for sharing custom GPTs between users.
Implement collaborative chat sessions.
Add Chat Folders for better organization.
Implement UI Themes (e.g., Dark Mode toggle).
Allow selection of different TTS voices.
Implement Voice Input using browser SpeechRecognition API.
More robust error handling and user feedback.
Option to export/import Custom GPT configurations.


AppleLamps/GPT
