Cellexis is a research platform that makes NASA's bioscience research publications easier to find and query. It combines Retrieval-Augmented Generation (RAG), a knowledge graph, and natural language processing so that researchers can explore, query, and analyze complex biological datasets from space missions.
- Semantic Search: AI-powered search across NASA bioscience publications
- Question Answering: Natural language queries with contextual answers
- Citation Tracking: Automatic citation generation with relevance scores
- Multi-document Analysis: Cross-reference findings across multiple papers
- Interactive Network: Dynamic visualization of research relationships
- Entity Recognition: Automatic extraction of biological entities and concepts
- Connection Mapping: Discover hidden relationships between research areas
- Fullscreen Mode: Immersive graph exploration experience
- Paper Comparison: Side-by-side analysis with consensus detection
- Smart Bookmarks: Organized research collection with notes
- Export Options: Multiple formats for research findings
- Voice Assistant: Hands-free navigation and search capabilities
- Responsive Design: Optimized for all device sizes
- Real-time Updates: Live search results and graph updates
- Authentication: Secure Firebase-based user management
- Progressive Web App: Install and use offline capabilities
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│    Frontend     │◄──►│   Backend API    │◄──►│    Databases    │
│   React + TS    │    │   FastAPI + AI   │    │ Neo4j + Vector  │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                      │                       │
         │                      │                       │
         v                      v                       v
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Web Client    │    │   AI Services    │    │  File Storage   │
│ Voice Commands  │    │  Google Gemini   │    │  PDF Documents  │
│  Real-time UI   │    │    Embeddings    │    │  Vector Index   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
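The retrieval half of this architecture can be illustrated with a minimal sketch. It is not the project's implementation: toy bag-of-words vectors stand in for Sentence Transformers embeddings, a brute-force cosine scan stands in for the FAISS index, and the names `embed`, `cosine`, and `top_k_chunks` are hypothetical. In a RAG flow, the chunks returned by a step like this would be handed to the text generation model as context.

```python
import math
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Score every chunk against the query and keep the k most similar."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Made-up chunk texts, for illustration only.
chunks = [
    "microgravity alters immune cell activation in mice",
    "plant root growth under simulated lunar gravity",
    "radiation exposure and bone density loss in astronauts",
]
results = top_k_chunks("immune response in microgravity", chunks, k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

A dedicated vector index replaces the linear scan once the corpus grows, but the ranking idea stays the same.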
Frontend:
- React 18.3.1 with TypeScript
- Vite build system
- Tailwind CSS + shadcn/ui components
- Firebase Authentication
- Cytoscape.js for graph visualization
- Web Speech API for voice commands
Backend:
- FastAPI with Python 3.8+
- Google Gemini AI for text generation
- Sentence Transformers for embeddings
- FAISS for vector similarity search
- Neo4j graph database
- PDFPlumber for document processing
Infrastructure:
- Frontend: Deployed on Netlify/Vercel
- Backend: Deployed on Render
- Database: Neo4j AuraDB (cloud)
- Authentication: Firebase Auth
- Node.js 16+ and npm/pnpm
- Python 3.8+
- Neo4j Database (local or cloud)
- Google AI Studio API Key
- Firebase Project (for authentication)
git clone https://github.com/your-repo/cellexis.git
cd cellexis
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env with your configuration
Required Environment Variables:
# API Keys
GOOGLE_API_KEY=your_google_api_key_here
NEO4J_URI=bolt://localhost:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
# Optional
OLLAMA_BASE_URL=http://localhost:11434 # If using Ollama
cd ../frontend
# Install dependencies
npm install # or pnpm install
# Set up environment variables
cp .env.example .env.development
# Edit .env.development with your configuration
Frontend Environment Variables:
# API Configuration
VITE_API_URL=http://localhost:8000
# Firebase Configuration (from your Firebase project)
VITE_FIREBASE_API_KEY=your_api_key
VITE_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
VITE_FIREBASE_PROJECT_ID=your-project-id
Option A: Local Neo4j
# Download and start Neo4j Desktop
# Create a new database with password
# Update NEO4J_* variables in backend/.env
Option B: Neo4j AuraDB (Cloud)
# Create account at neo4j.com/cloud/aura
# Create new instance
# Update connection details in backend/.env
Terminal 1 - Backend:
cd backend
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Terminal 2 - Frontend:
cd frontend
npm run dev
cd backend
# Create embeddings for sample documents
python create_embeddings.py
# Extract and load entities into Neo4j
python extract_entities.py
The backend requires several services and configurations:
- Go to Google AI Studio
- Create an API key
- Add to .env as GOOGLE_API_KEY
Local Installation:
# Using Docker
docker run \
--name neo4j \
-p7474:7474 -p7687:7687 \
-d \
-v $HOME/neo4j/data:/data \
-v $HOME/neo4j/logs:/logs \
-v $HOME/neo4j/import:/var/lib/neo4j/import \
-v $HOME/neo4j/plugins:/plugins \
--env NEO4J_AUTH=neo4j/your-password \
neo4j:latest
Cloud Setup (Recommended):
- Visit Neo4j Aura
- Create free instance
- Download connection details
- Update .env with connection URI and credentials
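Before starting the backend, it can help to confirm that the variables described above are present. The following is a minimal sketch, not part of the repository: the helper name `check_backend_env` is hypothetical, and it only checks that values exist and that `NEO4J_URI` uses one of the URI schemes accepted by the official Neo4j drivers. It does not contact Google or Neo4j.

```python
import os
from typing import List, Mapping
from urllib.parse import urlparse

# Variable names taken from the backend .env described in this README.
REQUIRED_VARS = ["GOOGLE_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD"]
# URI schemes accepted by the official Neo4j drivers.
NEO4J_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}


def check_backend_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    uri = env.get("NEO4J_URI", "")
    if uri and urlparse(uri).scheme not in NEO4J_SCHEMES:
        problems.append(f"NEO4J_URI has an unexpected scheme: {uri!r}")
    return problems


if __name__ == "__main__":
    # Values loaded from .env must already be exported into the environment.
    for problem in check_backend_env(os.environ):
        print("config problem:", problem)
```

Passing the environment in as a mapping keeps the check easy to exercise with plain dictionaries.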
Place your PDF documents in backend/documents/ directory:
backend/
├── documents/
│   ├── NASA_paper_1.pdf
│   ├── NASA_paper_2.pdf
│   └── ...
├── create_embeddings.py
└── extract_entities.py
- Go to Firebase Console
- Create new project
- Enable Authentication
- Enable Email/Password and Google providers
- Get configuration object
- Update frontend .env.development
The project uses Vite with the following key configurations:
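// vite.config.ts
import path from "path";
// Assumption: the standard React plugin; the project may use another variant (e.g. the SWC one).
import react from "@vitejs/plugin-react";
import { defineConfig } from "vite";

export default defineConfig({
  plugins: [react()],
  resolve: {
    alias: {
      // "@" resolves to the client source directory
      "@": path.resolve(__dirname, "./client"),
    },
  },
  server: {
    port: 8080, // dev server port
    host: true, // listen on all network interfaces
  }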
})
cellexis/
├── backend/                  # FastAPI backend
│   ├── app/
│   │   ├── main.py           # FastAPI app entry point
│   │   ├── rag_system.py     # RAG implementation
│   │   ├── graph_api.py      # Neo4j graph operations
│   │   └── models.py         # Pydantic models
│   ├── documents/            # PDF documents for processing
│   ├── create_embeddings.py  # Vector embeddings creation
│   ├── extract_entities.py   # Entity extraction for graph
│   └── requirements.txt
├── frontend/                 # React frontend
│   ├── client/
│   │   ├── components/       # React components
│   │   ├── pages/            # Page components
│   │   ├── lib/              # Utilities and API client
│   │   └── contexts/         # React contexts
│   ├── public/               # Static assets
│   └── package.json
└── README.md
The backend provides these main endpoints:
POST /search-rag
- Query: { "query": "string", "top_k": number }
- Response: { "query", "answer", "citations", "chunks_used" }
GET /graph
- Optional: ?filter_type=entity_type
- Response: { "nodes": [...], "edges": [...] }
GET /search?q=query
- Response: { "query", "results": [...] }
GET /health # Application health
GET /pingdb # Database connectivity
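The endpoints above can be called from any HTTP client. The sketch below uses only the Python standard library and assumes the backend is running locally on port 8000, as in the development setup. The helper names are hypothetical, and the `"Gene"` filter value is a made-up entity type, since this README does not list the valid ones.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # assumption: local development backend


def build_search_rag_request(query: str, top_k: int = 5, base_url: str = BASE_URL) -> Request:
    """Build the POST /search-rag request with a JSON body."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return Request(
        f"{base_url}/search-rag",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_graph_url(filter_type: Optional[str] = None, base_url: str = BASE_URL) -> str:
    """Build the GET /graph URL, adding filter_type only when it is given."""
    query = f"?{urlencode({'filter_type': filter_type})}" if filter_type else ""
    return f"{base_url}/graph{query}"


def send(request_or_url) -> dict:
    """Send a request and decode the JSON response (needs a running backend)."""
    with urlopen(request_or_url, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_search_rag_request("How does microgravity affect immune response?")
    print(request.get_method(), request.full_url)
    print(build_graph_url("Gene"))  # "Gene" is a hypothetical entity type
```

With the backend up, `send(build_search_rag_request(...))` should return a dictionary with the `answer` and `citations` fields described above, and `send(f"{BASE_URL}/health")` gives a quick liveness check.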
- Dashboard: Main search and visualization interface
- Login: Firebase authentication
- Features: Feature showcase
- Contact: Contact information
- PaperComparison: Side-by-side paper analysis
- BookmarksNotes: Research organization
- VisualizationEnhancements: Advanced graph features
- ExportShare: Export and sharing functionality
- UserFeedback: User feedback collection
The platform supports voice navigation:
Activation:
- Say "Hey Cellexis" or press Ctrl+Space
- Say "Stop" or press Escape to deactivate
Navigation Commands:
- "Go to search" / "Search"
- "Go to bookmarks" / "Bookmarks"
- "Go to comparison" / "Compare"
- "Go to visualization" / "Visualize"
Panel Commands:
- "Open left panel" / "Close left panel"
- "Open right panel" / "Close right panel"
- "Toggle left" / "Toggle right"
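The phrases above amount to a lookup from a recognized transcript to an action. The real handling lives in the frontend on top of the Web Speech API. The sketch below only illustrates the matching step in Python, and the action identifiers on the right-hand side are hypothetical.

```python
from typing import Optional

# Phrase -> action table built from the commands listed in this README.
# The action identifiers are hypothetical.
COMMANDS = {
    "go to search": "navigate:search",
    "search": "navigate:search",
    "go to bookmarks": "navigate:bookmarks",
    "bookmarks": "navigate:bookmarks",
    "go to comparison": "navigate:comparison",
    "compare": "navigate:comparison",
    "go to visualization": "navigate:visualization",
    "visualize": "navigate:visualization",
    "open left panel": "panel:left:open",
    "close left panel": "panel:left:close",
    "open right panel": "panel:right:open",
    "close right panel": "panel:right:close",
    "toggle left": "panel:left:toggle",
    "toggle right": "panel:right:toggle",
}


def resolve_command(transcript: str) -> Optional[str]:
    """Normalize a speech transcript and look up its action, or None if unknown."""
    # Lowercase, trim surrounding punctuation, and collapse repeated spaces.
    phrase = " ".join(transcript.lower().strip(" .!?").split())
    return COMMANDS.get(phrase)


print(resolve_command("Go to Search."))
print(resolve_command("play music"))
```

Exact matching keeps the behavior predictable. Unrecognized phrases return `None` so the caller can ignore them or ask the user to repeat.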
Backend (Render):
Connect Repository:
- Link your GitHub repository to Render
- Select the backend directory as root
Environment Variables:
GOOGLE_API_KEY=your_api_key
NEO4J_URI=neo4j+s://your-instance.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your_password
Build Configuration:
- Build Command: pip install -r requirements.txt
- Start Command: uvicorn app.main:app --host 0.0.0.0 --port $PORT
Frontend (Netlify/Vercel):
Build Settings:
- Build Command: npm run build
- Publish Directory: dist
Environment Variables:
VITE_API_URL=https://your-backend.onrender.com
VITE_FIREBASE_API_KEY=your_api_key
VITE_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
VITE_FIREBASE_PROJECT_ID=your-project-id
Security:
- Use HTTPS for all communications
- Implement rate limiting on API endpoints
- Secure Firebase rules for authentication
- Use environment variables for all secrets
Performance:
- Enable CDN for static assets
- Implement caching strategies
- Monitor API response times
- Optimize bundle sizes
Monitoring:
- Set up error tracking (Sentry)
- Monitor API usage and costs
- Database performance monitoring
- User analytics
// Using the API service
import { apiService } from '@/lib/api';
const results = await apiService.searchRAG(
"How does microgravity affect immune response?",
5
);
console.log(results.answer);
console.log(results.citations);
// Load and render knowledge graph
const graphData = await apiService.getGraph();
// Graph is automatically rendered using Cytoscape.js
// Voice commands are handled automatically
// Users can activate with "Hey Cellexis" or Ctrl+Space
// Commands like "Go to search" navigate to different tabs
We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create feature branch: git checkout -b feature/amazing-feature
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open Pull Request
Backend:
- Follow PEP 8 style guidelines
- Add type hints to all functions
- Include docstrings for modules and functions
- Write unit tests for new features
Frontend:
- Use TypeScript strictly
- Follow React best practices
- Use consistent naming conventions
- Add JSDoc comments for complex functions
Backend Tests:
cd backend
pytest tests/
Frontend Tests:
cd frontend
npm run test
API Performance:
- Response times for search endpoints
- RAG query processing time
- Graph generation performance
- Database query optimization
User Experience:
- Page load times
- Search result relevance
- Voice command accuracy
- Mobile responsiveness
Resource Usage:
- API rate limiting
- Database connection pooling
- Memory usage optimization
- Vector index efficiency
- Firebase Authentication with email/password
- JWT token validation on backend
- Protected routes for authenticated users
- Session management and refresh tokens
- CORS configuration for frontend domains
- Input validation and sanitization
- Rate limiting to prevent abuse
- Error handling without information disclosure
- Encrypted connections (HTTPS/TLS)
- Secure environment variable handling
- No sensitive data in client-side code
- Regular security audits
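As an illustration of the rate limiting item above, here is a minimal in-memory sliding-window limiter. It is a sketch under stated assumptions, not the project's implementation: the class name, limit, and window are hypothetical, and state held in a single process would not be shared across multiple backend instances.

```python
import time
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within the last `window` seconds."""

    def __init__(self, limit: int, window: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits: DefaultDict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True, or return False if over the limit."""
        now = self.clock()
        hits = self.hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(limit=2, window=10.0)
print([limiter.allow("203.0.113.7") for _ in range(3)])
```

In an API server, a check like `allow(client_ip)` would typically run before the request handler and answer with HTTP 429 when it returns `False`.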
Backend Not Starting:
# Check Python version
python --version # Should be 3.8+
# Verify dependencies
pip install -r requirements.txt
# Check environment variables
cat .env
Frontend Build Errors:
# Clear cache
npm cache clean --force
# Reinstall dependencies
rm -rf node_modules package-lock.json
npm install
# Check TypeScript configuration
npm run typecheck
Database Connection Issues:
# Test Neo4j connection
python -c "from neo4j import GraphDatabase; driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'password')); driver.verify_connectivity(); print('Connected!'); driver.close()"
# Check database status
docker ps # If using Docker
API Integration Problems:
- Verify CORS settings match frontend domain
- Check the VITE_API_URL environment variable
- Validate Google API key permissions
- Monitor network requests in browser dev tools
Backend Debug:
# Enable debug logging
export LOG_LEVEL=DEBUG
uvicorn app.main:app --reload --log-level debug
Frontend Debug:
# Development mode with source maps
npm run dev
# Check console for errors
# Use React Developer Tools
- LangChain - RAG framework
- Sentence Transformers - Text embeddings
- Cytoscape.js - Graph visualization
This project is licensed under the MIT License - see the LICENSE file for details.
- NASA for providing public access to bioscience research data
- Google AI team for Gemini API access
- Neo4j for graph database technology
- Open source community for the amazing tools and libraries
For support and questions:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
- Documentation: docs.cellexis.com
Built with ❤️ for the scientific research community