

Open-Source No-Code Web Data Extraction Platform

Maxun lets you train a robot in 2 minutes and scrape the web on auto-pilot. Web data extraction doesn't get easier than this!

Go To App | Documentation | Website | Discord | Twitter | Watch Tutorials

Getting Started

The simplest and fastest way to get started is to use the hosted version: https://app.maxun.dev. Maxun Cloud handles anti-bot detection, CAPTCHA solving, and automatic proxy rotation across a large proxy network.

Local Installation

  1. Create a root folder for your project (e.g. 'maxun')
  2. Create a file named .env in the root folder of the project
  3. An example env file can be viewed here. Copy its entire contents into your .env file.
  4. Choose your installation method below
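
Before choosing an installation method, it can help to confirm that the .env file is plain KEY=VALUE text. The sketch below is a minimal, illustrative parser for such a file. `parseEnvFile` is a hypothetical helper, not part of Maxun, and it is not how Maxun itself loads its configuration; it only handles blank lines, `#` comments, and values wrapped in matching quotes.

```typescript
// Hypothetical helper for illustration; not part of Maxun.
// Parses simple KEY=VALUE lines. This is not a full dotenv implementation.
function parseEnvFile(content: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of content.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines and comment lines.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    // Skip lines with no "=" or with an empty key.
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const quoted =
      value.length >= 2 &&
      ((value.startsWith('"') && value.endsWith('"')) ||
        (value.startsWith("'") && value.endsWith("'")));
    if (quoted) value = value.slice(1, -1);
    result[key] = value;
  }
  return result;
}
```

Only the first `=` on a line separates key from value, so URL values such as `http://localhost:8080` survive intact.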

Docker Compose

  1. Copy the docker-compose.yml file into your root folder
  2. Ensure you have set up the .env file in that same folder
  3. Run the command below from a terminal
docker-compose up -d

You can access the frontend at http://localhost:5173/ and backend at http://localhost:8080/

Without Docker

  1. Ensure you have Node.js, PostgreSQL, MinIO and Redis installed on your system.
  2. Run the commands below
git clone https://github.com/getmaxun/maxun

# change directory to the project root
cd maxun

# install dependencies
npm install

# change directory to maxun-core to install dependencies
cd maxun-core 
npm install

# get back to the root directory
cd ..

# install chromium and its dependencies (run from the project root)
npx playwright install --with-deps chromium

# start frontend and backend together
npm run start

You can access the frontend at http://localhost:5173/ and backend at http://localhost:8080/

Environment Variables

  1. Create a file named .env in the root folder of the project
  2. An example env file can be viewed here.
| Variable | Mandatory | Description | If Not Set |
|---|---|---|---|
| BACKEND_PORT | Yes | Port to run backend on. Needed for Docker setup | Default value: 8080 |
| FRONTEND_PORT | Yes | Port to run frontend on. Needed for Docker setup | Default value: 5173 |
| BACKEND_URL | Yes | URL to run backend on. | Default value: http://localhost:8080 |
| VITE_BACKEND_URL | Yes | URL used by frontend to connect to backend | Default value: http://localhost:8080 |
| PUBLIC_URL | Yes | URL to run frontend on. | Default value: http://localhost:5173 |
| VITE_PUBLIC_URL | Yes | URL used by backend to connect to frontend | Default value: http://localhost:5173 |
| JWT_SECRET | Yes | Secret key used to sign and verify JSON Web Tokens (JWTs) for authentication. | JWT authentication will not work. |
| DB_NAME | Yes | Name of the Postgres database to connect to. | Database connection will fail. |
| DB_USER | Yes | Username for Postgres database authentication. | Database connection will fail. |
| DB_PASSWORD | Yes | Password for Postgres database authentication. | Database connection will fail. |
| DB_HOST | Yes | Host address where the Postgres database server is running. | Database connection will fail. |
| DB_PORT | Yes | Port number used to connect to the Postgres database server. | Database connection will fail. |
| ENCRYPTION_KEY | Yes | Key used for encrypting sensitive data (proxies, passwords). | Encryption functionality will not work. |
| SESSION_SECRET | No | A strong, random string used to sign session cookies | Uses default secret. Recommended to define your own session secret to avoid session hijacking. |
| MINIO_ENDPOINT | Yes | Endpoint URL for MinIO, to store Robot Run Screenshots. | Connection to MinIO storage will fail. |
| MINIO_PORT | Yes | Port number for MinIO service. | Connection to MinIO storage will fail. |
| MINIO_CONSOLE_PORT | No | Port number for MinIO WebUI service. Needed for Docker setup. | Cannot access MinIO Web UI. |
| MINIO_ACCESS_KEY | Yes | Access key for authenticating with MinIO. | MinIO authentication will fail. |
| GOOGLE_CLIENT_ID | No | Client ID for Google OAuth. Used for Google Sheet integration authentication. | Google login will not work. |
| GOOGLE_CLIENT_SECRET | No | Client Secret for Google OAuth. Used for Google Sheet integration authentication. | Google login will not work. |
| GOOGLE_REDIRECT_URI | No | Redirect URI for handling Google OAuth responses. | Google login will not work. |
| AIRTABLE_CLIENT_ID | No | Client ID for Airtable, used for Airtable integration authentication. | Airtable login will not work. |
| AIRTABLE_REDIRECT_URI | No | Redirect URI for handling Airtable OAuth responses. | Airtable login will not work. |
| MAXUN_TELEMETRY | No | Disables telemetry to stop sending anonymous usage data. Keeping it enabled helps us understand how the product is used and assess the impact of any new changes. Please keep it enabled. | Telemetry data will not be collected. |
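
The mandatory variables listed above fall into two groups: those with a documented default, and those whose absence causes a failure. The following is a minimal sketch of a pre-flight check built only from that list. `checkEnv`, `EnvReport`, and the two constant names are hypothetical and not part of Maxun; the sketch assumes an empty or whitespace-only value counts as "not set".

```typescript
// Mandatory variables for which no default value is documented.
const REQUIRED_NO_DEFAULT: string[] = [
  "JWT_SECRET", "DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT",
  "ENCRYPTION_KEY", "MINIO_ENDPOINT", "MINIO_PORT", "MINIO_ACCESS_KEY",
];

// Mandatory variables with a documented default value.
const DEFAULTS: Record<string, string> = {
  BACKEND_PORT: "8080",
  FRONTEND_PORT: "5173",
  BACKEND_URL: "http://localhost:8080",
  VITE_BACKEND_URL: "http://localhost:8080",
  PUBLIC_URL: "http://localhost:5173",
  VITE_PUBLIC_URL: "http://localhost:5173",
};

type Env = Record<string, string | undefined>;

interface EnvReport {
  missing: string[];   // unset, with no documented default
  defaulted: string[]; // unset, but a default value is documented
}

// Hypothetical helper for illustration; not part of Maxun.
// Assumption: an empty or whitespace-only value is treated as unset.
function checkEnv(env: Env): EnvReport {
  const isSet = (name: string): boolean => (env[name] ?? "").trim() !== "";
  return {
    missing: REQUIRED_NO_DEFAULT.filter((name) => !isSet(name)),
    defaulted: Object.keys(DEFAULTS).filter((name) => !isSet(name)),
  };
}
```

In a Node.js script, passing `process.env` to `checkEnv` and stopping when `missing` is non-empty would surface configuration problems before the backend starts.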

How Do I Self-Host?

Check out the community self-hosting guide: https://docs.maxun.dev/self-host

How Does It Work?

Maxun lets you create custom robots that emulate user actions and extract data. A robot can perform any of the following actions: Capture List, Capture Text, or Capture Screenshot. Once a robot is created, it keeps extracting data for you without manual intervention.

1. Robot Actions

  1. Capture List: Extracts structured, bulk items from a website. Example: scraping product listings from Amazon.
  2. Capture Text: Extracts individual pieces of text content from a website.
  3. Capture Screenshot: Takes full-page or visible-section screenshots of a website.
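
One way to picture the three actions is as a tagged union. The types below are purely illustrative: `RobotAction`, its field names, and `describeAction` are assumptions made for this sketch and do not reflect Maxun's internal data model or API.

```typescript
// Hypothetical types for illustration; not Maxun's actual data model.
type RobotAction =
  | { kind: "captureList"; fields: string[] }         // structured, bulk items
  | { kind: "captureText"; label: string }            // one piece of text
  | { kind: "captureScreenshot"; fullPage: boolean }; // full page or visible section

// Returns a short human-readable summary of what an action would extract.
function describeAction(action: RobotAction): string {
  switch (action.kind) {
    case "captureList":
      return `list of items with fields: ${action.fields.join(", ")}`;
    case "captureText":
      return `single text value labelled "${action.label}"`;
    case "captureScreenshot":
      return action.fullPage ? "full-page screenshot" : "visible-section screenshot";
  }
}
```

Modelling the action kind as a discriminant lets the compiler confirm that every action type is handled in the `switch`.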

Features

  • ✨ Extract Data With No-Code
  • ✨ Handle Pagination & Scrolling
  • ✨ Run Robots On A Specific Schedule
  • ✨ Turn Websites to APIs
  • ✨ Turn Websites to Spreadsheets
  • ✨ Adapt To Website Layout Changes
  • ✨ Extract Behind Login
  • ✨ Integrations
  • ✨ MCP

Use Cases

Maxun supports a range of use cases, including lead generation, market research, and content aggregation. View the use cases in detail here: https://www.maxun.dev/#usecases


Note

This project is in the early stages of development. Your feedback is very important to us, and we're actively working on improvements.

License

This project is licensed under AGPLv3.

Support Us

Star the repository, contribute if you love what we’re building, or sponsor us.

Contributors

Thank you to the combined efforts of everyone who contributes!
