Text

Ping

Link

Data-Intensive Text Processing with MapReduce

Link

MapReduce Patterns, Algorithms, and Use Cases

Quote
"Data Journalism Handbook 1.0 BETA"

Welcome - The Data Journalism Handbook

Link

Exploratory data analysis (EDA)
Learning to explore data with plots

Quote
"The machine learning toolbox’s focus is on large scale kernel methods and especially on Support Vector Machines (SVM)"

shogun | A Large Scale Machine Learning Toolbox

Link

This opinionated guide exists to provide both novice and expert Python developers a best-practice handbook to the installation, configuration, and usage of Python on a daily basis.

Link

MADlib is an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine-learning methods for structured and unstructured data.

Quote
"GGobi is an open source visualization program for exploring high-dimensional data. It provides highly dynamic and interactive graphics such as tours, as well as familiar graphics such as the scatterplot, barchart and parallel coordinates plots"

GGobi data visualization system.

Tags: ml viz dm book
Link

DataKind (formerly known as Data Without Borders) brings together leading data scientists with high impact social organizations through a comprehensive, collaborative approach that leads to shared insights, greater understanding, and positive action through data in the service of humanity.

Link

Videos of some data and machine learning talks from PyCon US 2012

Tags: ml dm python