Apache Spark

Analyzed 4 months ago, based on code collected 5 months ago.

Project Summary

Apache Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and fast to write.

To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly, much more quickly than with disk-based systems such as Hadoop MapReduce.

To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells.
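
As a rough illustration of treating a distributed dataset like a local collection, the sketch below counts words twice: once over a plain Python list and once over a Spark RDD. It assumes the pyspark package is installed and uses a local master for demonstration only. The function names (tokenize, word_counts_local, word_counts_spark) are hypothetical and chosen for this example.

```python
from collections import Counter


def tokenize(line):
    """Split a line into lowercase words."""
    return line.lower().split()


def word_counts_local(lines):
    """Count words in an ordinary in-memory Python list."""
    return dict(Counter(word for line in lines for word in tokenize(line)))


def word_counts_spark(lines):
    """Same computation on a distributed dataset (requires the pyspark package)."""
    # Imported lazily so the local version runs without Spark installed.
    from pyspark.sql import SparkSession

    # ASSUMPTION: a local master for illustration; a real job would target a cluster.
    spark = (
        SparkSession.builder.master("local[*]")
        .appName("word-count-sketch")
        .getOrCreate()
    )
    try:
        # cache() asks Spark to keep the dataset in memory once it has been
        # computed, so later queries over it do not recompute it.
        words = spark.sparkContext.parallelize(lines).flatMap(tokenize).cache()
        counts = words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)
        return dict(counts.collect())
    finally:
        spark.stop()
```

Both functions apply the same tokenize step and the same count-by-key logic. The Spark version differs mainly in where the data lives and in the explicit cache() call.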

Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

In a Nutshell, Apache Spark...

has had 43,482 commits made by 2,982 contributors, representing 1,425,405 lines of code
is mostly written in Scala, with an average number of source code comments
has a well-established, mature codebase, maintained by a very large development team with stable year-over-year commits
took an estimated 409 years of effort (COCOMO model), starting with its first commit in March 2010 and ending with its most recent commit 5 months ago
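
The page says only "COCOMO model" and does not give the formula behind the effort estimate. As a sketch, the Basic COCOMO organic-mode formula (effort in person-months = a * KLOC^b) can be applied to the reported line count. The coefficients a = 2.4 and b = 1.05 are the textbook organic-mode values and are an assumption here, not something the page states. The function name is hypothetical.

```python
# Basic COCOMO, organic mode: effort in person-months = a * KLOC ** b.
# ASSUMPTION: a = 2.4 and b = 1.05 (textbook organic-mode values);
# the page itself only says "COCOMO model".
A_ORGANIC = 2.4
B_ORGANIC = 1.05


def cocomo_person_years(lines_of_code, a=A_ORGANIC, b=B_ORGANIC):
    """Return the estimated effort in person-years for a codebase of this size."""
    kloc = lines_of_code / 1000.0      # thousands of lines of code
    person_months = a * kloc ** b      # Basic COCOMO effort equation
    return person_months / 12.0        # convert person-months to person-years


if __name__ == "__main__":
    # Lines of code as reported on this page.
    print(f"{cocomo_person_years(1_425_405):.1f} person-years")
```

With these assumed coefficients the estimate comes out just under 410 person-years, which is consistent with the 409-year figure reported above.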

Quick Reference

Organization:

Apache Software Foundation

Code Locations:

https://github.com/apache/spark

Managers:

andykonwinski, rxin, and matei

Licenses

Apache License 2.0

Permitted: Commercial Use, Modify, Distribute, Place Warranty, Sub-License, Private Use, Use Patent Claims

Forbidden: Hold Liable, Use Trademarks

Required: Include Copyright, State Changes, Include License, Include Notice

These details are provided for information only. No information here is legal advice and should not be used as such.

Project Security

[Chart: Vulnerabilities per Version (last 10 releases)]

Project Vulnerability Report

[Gauge: Security Confidence Index, ranging from poor to favorable security track record]
[Gauge: Vulnerability Exposure Index, ranging from many to few reported vulnerabilities]

[Chart: Code, Lines of Code]
[Chart: Activity, Commits per Month]
[Chart: Community, Contributors per Month]

Languages

Scala	67%
Python	19%
Java	8%
12 Other	6%

30 Day Summary

Jan 2 2025 — Feb 1 2025

342 Commits
87 Contributors
including 12 new contributors

12 Month Summary

Feb 1 2024 — Feb 1 2025

3,747 Commits
Down 308 (7%) from previous 12 months
307 Contributors
Down 24 (7%) from previous 12 months

Most Recent Contributors

Dongjoon Hyun
Herman van Hovell
yangjie01
Wei Guo
Milan Dankovic
zhipeng.mao

Ratings

8 users rate this project:

5.0/5.0
