Apache Hadoop
Apache Hadoop is a collection of open-source software utilities that facilitate using a network
of many computers to solve problems involving massive amounts of data and computation. It
provides a software framework for distributed storage and processing of big data using the
MapReduce programming model.
Developed by: Apache Software Foundation
Initial release date: April 1, 2006
Written in: Java
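The MapReduce programming model mentioned above can be sketched in a few lines of plain Python. This is only a single-process illustration of the map, shuffle, and reduce steps, not Hadoop's Java API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample sentences are assumptions made for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group intermediate values by key, the step a framework runs between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of strings."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big clusters", "data moves to compute"])
print(counts["big"], counts["data"], counts["compute"])  # 2 2 1
```

In a real cluster the map and reduce calls would run on many machines in parallel, and the shuffle would move data across the network; here everything runs in one process.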
Apache Hive
Apache Hive is a data warehouse software project built on top of Apache Hadoop for
providing data query and analysis. Hive gives an SQL-like interface to query data stored
in various databases and file systems that integrate with Hadoop. Wikipedia
License: Apache License 2.0
Stable release: 2.3.0 / July 19, 2017; 13 months ago
Written in: Java
Apache Pig
Apache Pig is a high-level platform for creating programs that run on Apache Hadoop.
The language for this platform is called Pig Latin. Pig can execute its Hadoop jobs in
MapReduce, Apache Tez, or Apache Spark. Wikipedia
Stable release: v0.17.0 / June 19, 2017; 14 months ago
Operating system: Microsoft Windows, OS X, Linux
Developer(s): Apache Software Foundation, Yahoo Research
Initial release: September 11, 2008; 9 years ago
Programming language: Java
Apache Spark
Apache Spark is an open-source distributed general-purpose cluster-computing
framework. Originally developed at the University of California, Berkeley's AMPLab, the
Spark codebase was later donated to the Apache Software Foundation, which has
maintained it since. Wikipedia
Developer(s): Apache Software Foundation, UC Berkeley AMPLab, Databricks
Stable release: v2.3.1 / June 8, 2018; 2 months ago
License: Apache License 2.0
Operating system: Microsoft Windows, macOS, Linux
Original author: Matei Zaharia
Written in: Scala, Java, Python, R
Apache HBase
HBase is an open-source, non-relational, distributed database modeled after Google's
Bigtable and written in Java. It is developed as part of Apache Software Foundation's
Apache Hadoop project and runs on top of HDFS, providing Bigtable-like capabilities for
Hadoop. Wikipedia
License: Apache License 2.0
Stable release: 1.4.3 / 3 April 2018
Developed by: Apache Software Foundation
Written in: Java
Initial release: March 28, 2008; 10 years ago
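As a rough picture of what "Bigtable-like" means, the data model can be thought of as a sorted map from row key to column to timestamped versions. The sketch below is a toy in-memory model under that assumption, not the HBase client API; the class ToyTable, its methods, and the sample row keys are hypothetical.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: row key -> column name -> list of (timestamp, value)."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        """Store a new version of a cell; older versions are kept."""
        self._rows[row][column].append((timestamp, value))

    def get(self, row, column):
        """Return the value with the highest timestamp, or None if absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda version: version[0])[1]

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order within [start_row, stop_row)."""
        for key in sorted(self._rows):
            if start_row <= key < stop_row:
                yield key


table = ToyTable()
table.put("user#001", "info:email", "a@example.com", timestamp=1)
table.put("user#001", "info:email", "b@example.com", timestamp=2)
table.put("user#002", "info:email", "c@example.com", timestamp=1)

latest = table.get("user#001", "info:email")
rows = list(table.scan("user#001", "user#002"))
print(latest, rows)  # b@example.com ['user#001']
```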
MongoDB
MongoDB is a free and open-source cross-platform document-oriented database
program. Classified as a NoSQL database program, MongoDB uses JSON-like
documents with schemata. MongoDB is developed by MongoDB Inc., and is published
under a combination of the GNU Affero General Public License and the Apache
License. Wikipedia
License: Various; see § Licensing
Stable release: 4.0.2 / 29 August 2018; 10 days ago
Preview release: 4.1.2 / 14 August 2018; 25 days ago
Initial release date: 2009
Developed by: MongoDB Inc.
Written in: C++, C, JavaScript
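To illustrate what querying JSON-like documents looks like, here is a minimal sketch in plain Python that filters a list of dictionaries by field equality. It imitates the idea of a document query only; the helper names (matches, find) and the sample records are assumptions for this example, and no MongoDB driver is involved.

```python
def matches(document, query):
    """True when every field in the query equals the same field in the document."""
    return all(document.get(field) == expected for field, expected in query.items())


def find(collection, query):
    """Return every document in the collection that satisfies the query."""
    return [doc for doc in collection if matches(doc, query)]


# Documents in one collection need not share the same set of fields.
tools = [
    {"name": "Hadoop", "language": "Java", "tags": ["storage", "batch"]},
    {"name": "Redis", "language": "C"},
    {"name": "Kafka", "language": "Java", "license": "Apache-2.0"},
]

java_tools = [doc["name"] for doc in find(tools, {"language": "Java"})]
print(java_tools)  # ['Hadoop', 'Kafka']
```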
Sqoop
Sqoop is a command-line interface application for transferring data between relational
databases and Hadoop. Wikipedia
Stable release: 1.4.6 / May 11, 2015
Developed by: Apache Software Foundation
Written in: Java
License: Apache License 2.0
Apache ZooKeeper
Apache ZooKeeper is a software project of the Apache Software Foundation. It is
essentially a centralized service that offers distributed systems a hierarchical
key-value store, which is used to provide a distributed configuration service,
synchronization service, and naming registry for large distributed systems. Wikipedia
License: Apache License 2.0
Developed by: Apache Software Foundation
Written in: Java
Stable release: 3.4.11 / November 9, 2017
Preview release: 3.5.3-beta / April 17, 2017
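A hierarchical key-value store of the kind described above can be pictured as a tree of slash-separated paths, each holding a small value. The following is a toy in-memory sketch, not the ZooKeeper client API; the class ToyTree, its methods, and the paths used are hypothetical.

```python
class ToyTree:
    """Toy hierarchical key-value store keyed by slash-separated paths."""

    def __init__(self):
        self._nodes = {"/": b""}

    def create(self, path, data=b""):
        """Add a node; its parent must already exist."""
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        self._nodes[path] = data

    def get(self, path):
        """Return the bytes stored at a path."""
        return self._nodes[path]

    def children(self, path):
        """Return the names of the direct children of a path, sorted."""
        prefix = path.rstrip("/") + "/"
        return sorted(
            p[len(prefix):]
            for p in self._nodes
            if p != "/" and p.startswith(prefix) and "/" not in p[len(prefix):]
        )


tree = ToyTree()
tree.create("/config")
tree.create("/config/db", b"host=db.internal")
tree.create("/config/cache", b"ttl=60")

print(tree.get("/config/db"))    # b'host=db.internal'
print(tree.children("/config"))  # ['cache', 'db']
```

A configuration service or naming registry built on such a tree would have every client read the same paths; the real system also replicates the tree across servers, which this sketch leaves out.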
Apache Kafka
Apache Kafka is an open-source stream-processing software platform developed by the
Apache Software Foundation, written in Scala and Java. The project aims to provide a
unified, high-throughput, low-latency platform for handling real-time data
feeds. Wikipedia
Developer(s): Apache Software Foundation
License: Apache License 2.0
Stable release: 2.0.0 / July 30, 2018; 37 days ago
Initial release date: January 2011
Written in: Scala, Java
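One common way to picture a platform for real-time data feeds is an append-only log that several consumers read at their own pace. The sketch below models that idea in memory, assuming this log-with-offsets view; it is not the Kafka client API, and the class ToyLog, its methods, and the consumer names are hypothetical.

```python
class ToyLog:
    """Toy append-only log with one read position per named consumer."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, record):
        """Add a record to the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer, max_records=10):
        """Return unread records for a consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = ToyLog()
log.append("click")
log.append("view")

first_batch = log.poll("billing")   # reads both records so far
log.append("purchase")
second_batch = log.poll("billing")  # reads only the new record
audit_batch = log.poll("audit")     # a new consumer starts at offset 0

print(first_batch, second_batch, audit_batch)
```

Because each consumer keeps its own offset, a slow reader does not hold back a fast one, and records are not removed when they are read.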
Apache Cassandra
Apache Cassandra is a free and open-source, distributed, wide column store, NoSQL
database management system designed to handle large amounts of data across many
commodity servers, providing high availability with no single point of failure. Wikipedia
License: Apache License 2.0
Original author(s): Avinash Lakshman, Prashant Malik
Stable release: 3.11.3 / August 1, 2018
Developed by: Apache Software Foundation
Written in: Java
Repository: git://git.apache.org/cassandra.git; https://github.com/apache/cassandra
Apache Oozie
Apache Oozie is a server-based workflow scheduling system to manage Hadoop jobs.
Workflows in Oozie are defined as a collection of control flow and action nodes in a
directed acyclic graph. Control flow nodes define the beginning and the end of a
workflow as well as a mechanism to control the workflow execution path. Wikipedia
Platform: Java virtual machine
Stable release: 4.3.0 / 20 June 2017; 14 months ago
Developed by: Apache Software Foundation
Written in: Java
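The directed acyclic graph of control flow and action nodes can be illustrated with Python's standard graphlib module (Python 3.9 or later). This sketch only shows how a valid execution order follows from the graph; it is not Oozie's XML workflow format, and the node names and the helper execution_order are assumptions for this example.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(workflow):
    """Return the nodes in an order that respects every dependency.

    `workflow` maps each node to the set of nodes that must finish first.
    """
    try:
        return list(TopologicalSorter(workflow).static_order())
    except CycleError as exc:
        raise ValueError("workflow is not acyclic") from exc


# "start" and "end" play the role of control flow nodes; the rest are actions.
workflow = {
    "import_logs": {"start"},
    "import_users": {"start"},
    "build_report": {"import_logs", "import_users"},
    "end": {"build_report"},
}

order = execution_order(workflow)
print(order[0], order[-1])  # start end
```

The two import actions have no dependency on each other, so a scheduler is free to run them in parallel; only their position relative to "start" and "build_report" is fixed.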
Apache Storm
Apache Storm is a distributed stream processing computation framework written
predominantly in the Clojure programming language. Originally created by Nathan Marz
and team at BackType, the project was open sourced after being acquired by
Twitter. Wikipedia
License: Apache License 2.0
Stable release: 1.0.5 / 15 September 2017
Developer(s): BackType, Twitter
Programming languages: Java, Clojure
Apache Mahout
Apache Mahout is a project of the Apache Software Foundation to produce free
implementations of distributed or otherwise scalable machine learning algorithms
focused primarily in the areas of collaborative filtering, clustering and classification.
Many of the implementations use the Apache Hadoop platform. Wikipedia
Stable release: 0.13.0 / 17 April 2017
Developed by: Apache Software Foundation
License: Apache License 2.0
Written in: Java, Scala
Amazon EC2
Amazon Elastic Compute Cloud forms a central part of Amazon.com's cloud-computing
platform, Amazon Web Services, by allowing users to rent virtual computers on which to
run their own computer applications. Wikipedia
Initial release: August 25, 2006; 12 years ago (public beta)
Original author(s): Amazon.com, Inc
Developed by: Amazon.com
Operating system: Linux; Microsoft Windows; FreeBSD
License: Proprietary software
Redis
Redis (pronounced RE-dis) is an open-source in-memory data structure project implementing a
distributed, in-memory key-value database with optional durability. Redis supports
different kinds of abstract data structures, such as strings, lists, maps, sets, sorted sets,
hyperloglogs, bitmaps and spatial indexes. Wikipedia
License: BSD
Written in: ANSI C
Developer(s): Salvatore Sanfilippo
Stable release: 4.0.11 / August 3, 2018; 35 days ago
Initial release date: April 10, 2009
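Of the data structures listed above, the sorted set is perhaps the least familiar. The sketch below is a toy in-memory model of one, assuming members are ordered by score and then by name; it is not the Redis command set, and the class ToySortedSet and the leaderboard data are hypothetical. Only non-negative ranks are handled.

```python
class ToySortedSet:
    """Toy sorted set: unique members, each with a numeric score."""

    def __init__(self):
        self._scores = {}

    def add(self, member, score):
        """Insert a member, or update its score if it is already present."""
        self._scores[member] = score

    def range(self, start, stop):
        """Return members with ranks start..stop (inclusive, non-negative)."""
        ordered = sorted(self._scores, key=lambda m: (self._scores[m], m))
        return ordered[start:stop + 1]


board = ToySortedSet()
board.add("alice", 120)
board.add("bob", 95)
board.add("carol", 120)
board.add("bob", 130)  # updates bob's score instead of adding a duplicate

lowest_two = board.range(0, 1)
everyone = board.range(0, 2)
print(lowest_two, everyone)  # ['alice', 'carol'] ['alice', 'carol', 'bob']
```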
Elasticsearch
Elasticsearch is a search engine based on Lucene. It provides a distributed, multitenant-
capable full-text search engine with an HTTP web interface and schema-free JSON
documents. Elasticsearch is developed in Java and is released as open source under
the terms of the Apache License. Wikipedia
License: Apache License 2.0
Initial release: February 8, 2010; 8 years ago
Developer(s): Shay Banon
Stable release: 6.4.0 / August 23, 2018; 9 days ago
Written in: Java
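Full-text search over schema-free documents is typically built on an inverted index, which maps each term to the documents that contain it. The sketch below shows that idea in plain Python; it is not the Elasticsearch HTTP interface or Lucene's implementation, and the helpers build_index and search, the whitespace tokenizer, and the sample documents are assumptions for this example.

```python
from collections import defaultdict


def build_index(documents):
    """Map every lower-cased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for field_value in doc.values():
            for term in str(field_value).lower().split():
                index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Documents are free-form dictionaries; they need not share fields.
documents = {
    1: {"title": "Distributed search engine"},
    2: {"title": "Distributed database", "kind": "NoSQL"},
    3: {"title": "Search tips"},
}

index = build_index(documents)
print(search(index, "distributed search"))  # {1}
```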
Apache Flume
Apache Flume is distributed, reliable, and available software for efficiently collecting,
aggregating, and moving large amounts of log data. It has a simple and flexible
architecture based on streaming data flows. Wikipedia
Stable release: 1.8.0 / 4 October 2017
Written in: Java
Platform: Cross-platform
License: Apache License 2.0
Docker
Docker is a computer program that performs operating-system-level virtualization, also
known as "containerization". It was first released in 2013 and is developed by Docker,
Inc. Docker is used to run software packages called "containers". Wikipedia
License: Binaries: Freemium software as a service; Source code: Apache License 2.0
Initial release: 13 March 2013; 5 years ago
Stable release: 18.06.1-ce / 22 August 2018; 13 days ago
Written in: Go
Operating system: Linux, Windows, macOS
Platforms: x86-64, ARM architecture
Kubernetes
Kubernetes (pronounced "Kū-bər-NəT-ēz") is an open-source container-orchestration
system for automating deployment, scaling and management of containerized
applications. It was originally designed by Google and is now maintained by the Cloud
Native Computing Foundation. Wikipedia
License: Apache License 2.0
Initial release: 7 June 2014; 4 years ago
Written in: Go
Stable release: 1.11.2 / August 8, 2018; 32 days ago
Developed by: Linux Foundation
OpenStack
OpenStack is a free and open-source software platform for cloud computing, mostly
deployed as infrastructure-as-a-service, whereby virtual servers and other resources
are made available to customers.
License: Apache License 2.0
Stable release: Rocky (2018.08.30) / 30 August 2018; 5 days ago
Initial release: 21 October 2010; 7 years ago
Written in: Python
Ansible
Ansible is open source software that automates software provisioning, configuration
management, and application deployment. Ansible connects via SSH, remote
PowerShell or via other remote APIs. Wikipedia
Written in: Python, PowerShell
Puppet
In computing, Puppet is an open-source software configuration management tool. It
runs on many Unix-like systems as well as on Microsoft Windows, and includes its own
declarative language to describe system configuration. Puppet is produced by Puppet,
founded by Luke Kanies in 2005. Wikipedia
Stable release: 5.5.3 (July 17, 2018; 40 days ago)
Preview release: 5.5.0.134.g0251aa7 (March 31, 2018; 4 months ago)
Operating system: Linux, Unix-like, Microsoft Windows
Developed by: Puppet
License: Apache for >2.7.0, GPL for prior versions
Programming languages: Ruby, C++, Clojure
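A declarative description of system configuration states what the system should look like rather than which commands to run. The sketch below illustrates that idea in plain Python, assuming a node's state is a simple dictionary; it is not Puppet's language or API, and the functions plan_changes and apply_state and the service names are hypothetical.

```python
def plan_changes(current, desired):
    """Return only the settings that differ from the desired state."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


def apply_state(current, desired):
    """Bring `current` in line with `desired` and report what changed."""
    changes = plan_changes(current, desired)
    current.update(changes)
    return changes


node = {"ntp": "stopped"}
desired = {"ntp": "running", "sshd": "running"}

first_run = apply_state(node, desired)   # both services need a change
second_run = apply_state(node, desired)  # nothing left to do

print(first_run, second_run)
```

Running the same description twice changes nothing the second time, which is the property that makes declarative configuration safe to reapply.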
Chef
Chef is both the name of a company and the name of a configuration management tool
written in Ruby and Erlang. It uses a pure-Ruby, domain-specific language for writing
system configuration "recipes". Wikipedia
Developed by: Chef
Written in: Ruby
License: Apache License 2.0
Stable release: 14.2.0 (client), 2018-06-07; 12.17.33 (server), 2018-02-22
Operating system: GNU/Linux, AT&T Unix, MS Windows, FreeBSD, Mac OS X, IBM
AIX, illumos
Initial release: January 2009; 9 years ago
Vagrant
Vagrant is an open-source software product for building and maintaining portable virtual
software development environments, e.g. for VirtualBox, Hyper-V, Docker containers,
VMware, and AWS. It tries to simplify software configuration management of
virtualizations in order to increase development productivity. Wikipedia
License: MIT License
Developed by: HashiCorp (Mitchell Hashimoto and John Bender)
Written in: Ruby
Stable release: 2.1.1 / May 7, 2018
Operating system: Linux, FreeBSD, macOS, and Microsoft Windows
Initial release: March 8, 2010; 8 years ago
OpenShift
OpenShift can refer to OpenShift Origin, Red Hat OpenShift Online, OpenShift
Dedicated, OpenShift Container Platform or OpenShift.io. OpenShift Origin is a
computer software product from Red Hat for container-based software deployment and
management. Wikipedia
License: Apache License 2.0
Stable release: 3.9 / March 2018; 6 months ago
Initial release: May 4, 2011; 7 years ago
Developed by: Red Hat Software
Operating system: Red Hat Enterprise Linux or Container Linux by CoreOS
Written in: Go, AngularJS