Data Science with Hadoop.pdf



As adoption of Hadoop accelerates in the enterprise and beyond, there's soaring demand for those who can solve real-world problems by applying advanced data science techniques in Hadoop environments. Now there's a complete and up-to-date guide to data science with Hadoop: high-level concepts, deep-dive techniques, practical applications, hands-on tutorials, and real-world use cases. Drawing on their extensive experience with Hadoop in enterprise Big Data environments, the authors bring together the practical knowledge you'll need to do real, useful data science with Hadoop. Coverage includes:

  • What data science is, what data scientists do, and how to build or join a data science team
  • Core data science applications in retail, healthcare, insurance, banking, education, and beyond
  • How Hadoop has evolved into an outstanding environment for doing data science
  • A day in the life of a data scientist: exploration, iteration, and more
  • Getting your data into Hadoop: data lakes, Sqoop, Flume, Falcon, and more
  • Preparing your data, from start to finish
  • Data modeling and machine learning
  • Visualization: how (and how not) to use it
  • Start-to-finish case studies: recommender systems, customer segmentation, sentiment analysis, and predictive risk modeling
  • The future: Storm online scoring, Giraph graph algorithms, Solr/Elasticsearch, and more


Part 1: Data Science with Hadoop - An Overview
1. Introduction to Data Science
2. Data Science Use Cases
3. Hadoop and Data Science

Part 2: The Process of Data Science with Hadoop
4. The Process of Data Science
5. Getting the Data into Hadoop
6. Data Preparation
7. Data Modeling
8. Visualization

Part 3: Real World Examples
9. Building a Recommender System with Mahout
10. Customer Segmentation with K-means
11. Analyzing Sentiment
12. Predictive Risk Modeling

Part 4: The Road Ahead
13. Advanced Topics
14. The Data Science Journey

