
Works by Douglas Eadline


Reviews

Summary

The book concerns the concepts and operation of a “Big Data” environment using the Apache Hadoop 2 ecosystem. As well as the usual introductory sections, it contains 10 major sections and 5 appendices.

The book really does take you from soup to nuts, as they say in the US, starting with an introduction to the concepts and history of Hadoop and Big Data, through installation, file system basics, the MapReduce framework and programming, and Hadoop tools (including YARN applications), and finally the management and administration of Hadoop under Apache Ambari. The book also has its own web site, complete with code downloads, question-and-answer forums, resource links and update information.

The guide starts at the very beginning for the complete novice, taking them through a step-by-step process to install Hadoop on a single machine, either as a virtual Hadoop sandbox (the Hortonworks HDP [Hortonworks Data Platform] Sandbox, to be precise) or in pseudo-distributed mode. The former is available for Microsoft or Apple operating systems. The latter, while more complex, more closely resembles a fully operational Hadoop environment. Normally, a Hadoop environment uses a cluster of servers running in a data centre, but this Quick Start Guide provides the process needed to set it up on a stand-alone desktop or laptop for personal use and evaluation. Obviously, this restricts the size of the data involved and the analysis that can be undertaken, but it does provide an introduction for the individual approaching “Big Data” for the first time.

In a similar manner, the book then takes the reader through the full operation of the Hadoop 2 system, with code examples where necessary. All of this can therefore be used by both novices and more experienced users working in a full-blown operational Hadoop environment.

The structure of the book is also linked to the video tutorials Hadoop Fundamentals: Live Lessons and Apache Hadoop YARN Fundamentals: Live Lessons, also produced by Douglas Eadline and Addison-Wesley, so that the two can be used in conjunction. The author suggests that this may be the best approach for taking the subject matter on board.

Review

In essence, there is something in the Hadoop 2 Quick Start Guide for everyone, from those who just want to see what all the Hadoop noise is about to those who are regular Hadoop users or administrators. The format used is excellent for this type of book, and one that should perhaps set the standard for other ‘quick start’ guides. The instructions and code examples are easy to follow and provide all the required background. The layout also aids the reader who wants to pick and choose what they read, depending on their needs at the time, while still providing for the reader who needs to see the whole picture.

Particularly interesting was the section on HDFS (Hadoop Distributed File System), which provides the background to the structure chosen for its storage and command environment.

One of the appendices even gives a summary of the additional resource content in the full sections, so that the really high-level ‘helicopter’ reader is also served.

Obviously, as the title suggests, there is more detail to be had and I look forward to reading Douglas Eadline’s books at that level as well.
 
lkeighley | Nov 5, 2019 |
Overview of the issues involved in setting up a high-performance computer cluster.
 
frogman2 | May 1, 2010 |

Statistics

Works: 2
Members: 7
Popularity: #1,123,407
Rating: 4.0
Reviews: 2
ISBNs: 5