Credit goes to sourceforge.net

Data Integration Tools for Linux

Browse free open source Data Integration tools and projects for Linux below. Use the toggles on the left to filter open source Data Integration tools by OS, license, language, programming language, and project status.

  • 1
    Pentaho

    Pentaho offers a comprehensive data integration and analytics platform.

    Pentaho couples data integration with business analytics in a modern platform to easily access, visualize, and explore data that impacts business results. Use it as a full suite or as individual components that are accessible on-premise, in the cloud, or on the go (mobile). Pentaho enables IT and developers to access and integrate data from any source and deliver it to your applications, all from within an intuitive, easy-to-use graphical tool. The Pentaho Enterprise Edition Free Trial can be obtained from https://pentaho.com/download/
    Downloads: 1,385 This Week
  • 2
    Pentaho Data Integration

    Pentaho Data Integration (ETL), a.k.a. Kettle

    Pentaho Data Integration uses the Maven framework. The project distribution archive is produced under the assemblies module. The other modules cover the core implementation, database dialog, user interface, PDI engine, PDI engine extensions, PDI core plugins, and integration tests. Maven version 3+ and Java JDK 1.8 are prerequisites. Using the Pentaho checkstyle format (by running mvn checkstyle:check and reviewing the report) and developing working unit tests help to ensure that pull requests for bugs and improvements are processed quickly. In addition to the unit tests, there are integration tests that test cross-module operation.
    Downloads: 83 This Week
  • 3
    Airbyte

    Data integration platform for ELT pipelines from APIs, databases

    We believe that only an open-source solution to data movement can cover the long tail of data sources while empowering data engineers to customize existing connectors. Our ultimate vision is to help you move data from any source to any destination. Airbyte already provides the largest catalog of 300+ connectors for APIs, databases, data warehouses, and data lakes. Moving critical data with Airbyte is as easy and reliable as flipping on a switch. Our teams process more than 300 billion rows each month for ambitious businesses of all sizes. Enable your data engineering teams to focus on projects that are more valuable to your business. Building and maintaining custom connectors has become 5x easier with Airbyte. With an average response time of 10 minutes or less and a Customer Satisfaction score of 96/100, our team is ready to support your data integration journey all over the world.
    Downloads: 9 This Week
  • 4
    Dagster

    An orchestration platform for the development, production

    Dagster is an orchestration platform for the development, production, and observation of data assets. Dagster as a productivity platform: With Dagster, you can focus on running tasks, or you can identify the key assets you need to create using a declarative approach. Embrace CI/CD best practices from the get-go: build reusable components, spot data quality issues, and flag bugs early. Dagster as a robust orchestration engine: Put your pipelines into production with a robust multi-tenant, multi-tool engine that scales technically and organizationally. Dagster as a unified control plane: The ‘single pane of glass’ data teams love to use. Rein in the chaos and maintain control over your data as the complexity scales. Centralize your metadata in one tool with built-in observability, diagnostics, cataloging, and lineage. Spot any issues and identify performance improvement opportunities.
    Downloads: 3 This Week
  • 5
    Recap

    Recap tracks and transforms schemas across your whole application

    Recap is a schema language and multi-language toolkit to track and transform schemas across your whole application. Your data passes through web services, databases, message brokers, and object stores. Recap describes these schemas in a single language, regardless of which system your data passes through. Recap schemas can be defined in YAML, TOML, JSON, XML, or any other compatible language.
    Downloads: 3 This Week
  • 6
    nichenetr

    NicheNet: predict active ligand-target links between interacting cells

    nichenetr: the R implementation of the NicheNet method. The goal of NicheNet is to study intercellular communication from a computational perspective. NicheNet uses human or mouse gene expression data of interacting cells as input and combines this with a prior model that integrates existing knowledge on ligand-to-target signaling paths. This makes it possible to predict ligand-receptor interactions that might drive gene expression changes in cells of interest. This model of prior information on potential ligand-target links can then be used to infer active ligand-target links between interacting cells. NicheNet prioritizes ligands according to their activity (i.e., how well they predict observed changes in gene expression in the receiver cell) and looks for affected targets with high potential to be regulated by these prioritized ligands.
    Downloads: 3 This Week
  • 7
    Apache Hudi

    Upserts, Deletes And Incremental Processing on Big Data

    Apache Hudi (pronounced Hoodie) stands for Hadoop Upserts Deletes and Incrementals. Hudi manages the storage of large analytical datasets on DFS (cloud stores, HDFS, or any Hadoop FileSystem-compatible storage). Apache Hudi is a transactional data lake platform that brings database and data warehouse capabilities to the data lake. Hudi reimagines slow old-school batch data processing with a powerful new incremental processing framework for low-latency, minute-level analytics. Hudi provides efficient upserts by mapping a given hoodie key (record key + partition path) consistently to a file id via an indexing mechanism. This mapping between record key and file group/file id never changes once the first version of a record has been written to a file. In short, the mapped file group contains all versions of a group of records.
    Downloads: 2 This Week
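The key-to-file mapping described in the Hudi entry above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hudi's implementation or API: the class `ToyFileGroupIndex`, the `max_keys_per_file` parameter, and the `fg-N` file naming are all hypothetical and exist purely to show that a key, once assigned to a file group, always routes to the same one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A "hoodie key" in this toy model: (record_key, partition_path).
HoodieKey = Tuple[str, str]


@dataclass
class ToyFileGroupIndex:
    """Hypothetical in-memory index; NOT Hudi's real index or API."""

    max_keys_per_file: int = 2  # assumed grouping rule, for illustration only
    key_to_file: Dict[HoodieKey, str] = field(default_factory=dict)
    partition_files: Dict[str, List[str]] = field(default_factory=dict)
    keys_in_file: Dict[str, int] = field(default_factory=dict)
    versions: Dict[str, List[dict]] = field(default_factory=dict)

    def _assign(self, partition_path: str) -> str:
        """Pick a file group for a key that has never been seen before."""
        files = self.partition_files.setdefault(partition_path, [])
        if not files or self.keys_in_file[files[-1]] >= self.max_keys_per_file:
            files.append(f"{partition_path}/fg-{len(files)}")
            self.keys_in_file[files[-1]] = 0
        self.keys_in_file[files[-1]] += 1
        return files[-1]

    def upsert(self, record_key: str, partition_path: str, payload: dict) -> str:
        """Route a record to its file group; the mapping is fixed on first write."""
        key = (record_key, partition_path)
        file_id = self.key_to_file.get(key)
        if file_id is None:
            file_id = self._assign(partition_path)
            self.key_to_file[key] = file_id
        # Every version of the record lands in the same file group.
        self.versions.setdefault(file_id, []).append({"key": record_key, **payload})
        return file_id


if __name__ == "__main__":
    index = ToyFileGroupIndex()
    first = index.upsert("user-1", "2024/01", {"score": 10})
    again = index.upsert("user-1", "2024/01", {"score": 11})
    print(first == again)  # the update goes to the file group of the first write
```

Because the lookup happens before any new assignment, an update never moves a record, which is why the mapped file group ends up holding all versions of its records.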
  • 8
    PANDORA

    Revolutionizing Biomedical Research with Advanced Machine Learning

    PANDORA is a machine learning (ML) tool that can be used to integrate various data types, including clinical, transcriptome, and microbiome data, and to find connections in large datasets. PANDORA can be easily installed using Docker; a pre-built version of the software can be pulled from DockerHub. In order to run a test instance of PANDORA, users will first need to prepare their local environment by downloading, installing, and configuring Docker. genular is the community behind SIMON, an open-source machine learning knowledge discovery software, built by a vibrant community of people just like you! Join us and make SIMON even cooler! Exploratory analysis of machine learning results with the help of many different visualization techniques will give you instant insights into models and data.
    Downloads: 2 This Week
  • 9
    Apache DevLake

    Apache DevLake is an open-source dev data platform

    Apache DevLake is an open-source dev data platform that ingests, analyzes, and visualizes the fragmented data from DevOps tools to extract insights for engineering excellence, developer experience, and community growth. Apache DevLake is designed for developer teams looking to make better sense of their development process and to bring a more data-driven approach to their own practices. You can ask Apache DevLake many questions regarding your development process. Just connect and query. Your Dev Data lives in many silos and tools. DevLake brings them all together to give you a complete view of your Software Development Life Cycle (SDLC). From DORA to scrum retros, DevLake implements metrics effortlessly with prebuilt dashboards supporting common frameworks and goals. DevLake fits teams of all shapes and sizes, and can be readily extended to support new data sources, metrics, and dashboards, with a flexible framework for data collection and transformation.
    Downloads: 1 This Week
  • 10
    nango

    A single API for all your integrations.

    Nango is a single API to interact with all other external APIs. It should be the only API you need to integrate into your app. Nango is an open-source solution for integrating third-party APIs with applications, simplifying API authentication, data syncing, and management.
    Downloads: 1 This Week
  • 11
    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real-time alerting, basket analysis, bubble chart Warehouse validation, single customer view, etc., as defined by strategy. The tool is being developed into a high-performance integrated data management platform that will seamlessly perform Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Metadata Discovery, Anomaly Discovery, Data Cleansing, Reporting, and Analytics. It also has Hadoop (big data) support to move files to and from a Hadoop grid and to create, load, and profile Hive tables. This project is also known as "Aggregate Profiler". A RESTful API for this project is being built (beta version) at https://sourceforge.net/projects/restful-api-for-osdq/ and Apache Spark-based data quality is being built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 2 This Week
  • 12
    Metl ETL Data Integration

    Simple message-based, web-based ETL integration

    Metl is a simple, web-based ETL tool that allows for data integrations including database, files, messaging, and web services. Supports RDBMS, SOAP, HTTP, FTP, SFTP, XML, FIXLEN, CSV, JSON, ZIP, and more. Metl implements scheduled integration tasks without the need for custom coding or heavy infrastructure. It can be deployed in the cloud or in an internal data center, and it was built to allow developers to extend it with custom components.
    Downloads: 6 This Week
  • 13
    Templates for integrating the data structures of Compiere, Openbravo, or ADempiere for all kinds of Pentaho Data Integration processes. Later on we plan to migrate these to Talend too.
    Downloads: 5 This Week
  • 14
    PDI Data Vault framework

    Data Vault loading automation using Pentaho Data Integration.

    A metadata-driven 'tool' to automate loading a designed Data Vault. It consists of a set of Pentaho Data Integration and database objects. The Virtual Machine (VMware) is a 64-bit Ubuntu Server 14.04, with MySQL (Percona Server) and PostgreSQL 9.4 as the database flavours and PDI version 5.2 CE. NB: Directory version_2.4 contains the most recent Virtual Machine. The readme.txt contains info about that VM.
    Downloads: 5 This Week
  • 15
    The aim of this project is to publish releases of Pentaho Data Integration not provided by pentaho.org
    Downloads: 4 This Week
  • 16
    PaloKettlePlugin is for Pentaho Data Integration, aka Kettle. It's a Cell Input and Output Step for Palo MOLAP. The first code was developed by mybiq/3A-Strategy; the PDI-3 version was developed by Stratebi. Now by 3A-Strategy and Litebi for PDI
    Downloads: 4 This Week
  • 17
    N-Browse is a client-server package for interactive visualization of network data with heterogeneous types of links, intended for ease of use and designed using a generic database schema for data integration and visualization.
    Downloads: 3 This Week
  • 18
    ARSystem plugins for Pentaho Kettle

    AR-System step and db plugins for Pentaho Data Integration Kettle V5

    Allows you to write via the API to an AR-System Server (BMC Remedy Action Request System). Includes two output steps, one input step, and one database plugin. The step plugins need the database plugin.
    Downloads: 1 This Week
  • 19
    Jaspersoft ETL
    Jaspersoft ETL is a data integration platform providing high performance data extract-transform-load (ETL) capabilities. Jaspersoft ETL is appropriate for all analytic and operational data integration needs. Activity on this project is located at jas
    Downloads: 1 This Week
  • 20
    bio2rdf
    The Bio2RDF project aims to transform silos of life science data into a globally distributed network of linked data for biological knowledge discovery. Bio2RDF creates and provides machine-understandable descriptions of biological entities using the RDF/RDFS/OWL Semantic Web languages. Using both syntactic and semantic data integration techniques, Bio2RDF seamlessly integrates diverse biological data and enables powerful new SPARQL-based services across its globally distributed knowledge bases.
    Downloads: 1 This Week
  • 21
    The Stem Cell Artificial Neural network project entails the analysis and integration of genomics data for extracting the stemness signature of several tissues by training a multiclass single-layer linear artificial neural network.
    Downloads: 0 This Week
  • 22
    Arch Data Integration Framework
    Downloads: 0 This Week
  • 23
    AWS data tools

    AWS tools for data integration and more

    Here I list some data tools I created for Amazon AWS S3 and Redshift
    Downloads: 0 This Week
  • 24
    Alova.js

    Workflow-Streamlined next-generation request tools

    An extremely streamlined API integration workflow. Quickly find APIs in the editor, and enjoy full type hints even in JS projects with the API code automatically generated by Alova's extension. Make requests in various complex scenarios with one line of code. Automatically manage paging data and data preloading, reduce unnecessary data refreshes, improve fluency by 300%, and reduce coding difficulty by 50%. Send requests immediately by watching state changes, which is useful for tab switching and conditional querying. A global interceptor supports silent token refresh and provides unified management of token-based login, logout, token assignment, and token refresh.
    Downloads: 0 This Week
  • 25
    BD integration

    Heterogeneous BD integration

    The increasing need to obtain a generalized view of information resources held in various systems has led to the development of data integration mechanisms, which focus on organizing efficient access to external, heterogeneous data sources through a single interface. The project includes a mass integration platform that makes it possible to create a global infrastructure of tens or hundreds of heterogeneous databases based on a service-oriented approach.
    Downloads: 0 This Week