K**A
Book
Nice book
A**R
Introducing Data Science PLUS MUCH MUCH MORE!
Loved this book! If I could have given 6 stars, I would have.
This book gives you a very well-rounded approach to data science. By that I mean it truly takes you for a ride through all the aspects of the field, rather than showing you some regression algorithm in Python and calling it data science.
The book has it all. Not only does it use probably the most popular language (Python) for its examples, it also goes into detail on supporting tools and ecosystems. Spark, for example: why create something when Spark is already here and we can just use it in our work?
It covers NoSQL technologies, giving readers enough information to get started and weighing the pros and cons of each. I especially enjoyed the sections on ACID, BASE and the CAP theorem. I am familiar with them and gave a presentation on the exact same topic a few years ago, and I enjoyed the read because it covered the key points, leaving me with a nice warm feeling in my stomach that unaware readers will be in good hands!
During the discussion of NoSQL, ElasticSearch is introduced, and an entire chapter is devoted to leveraging its search capabilities to get valuable results. Search is what ElasticSearch does best! The section on Damerau-Levenshtein was great. It made you think about the dirty data that is present in the real world and how you deal with it (versus giving you an example with perfectly clean, ready-to-use data).
Speaking of real-world experience: instead of trying to be a data science book that throws cool Python libraries at you, this book takes a step back and talks about the general approach to data science projects in the real world, making you think about the project's research goals. Why are we doing this? This helps you think and pick the right solutions.
Another example of real-world problems is the chapter on dealing with big, and I mean truly big, data. In a sample program you can surely play with tens or hundreds of sample records, but what do you do with gigs or more of data? When running production servers, you are not dealing with 2-3 lines of log entries; sometimes you deal with gigs! So I was very happy to see a section on how you can tackle problems like that.
The authors did a great job, in my opinion, by cloning the pywebhdfs package and making available a version that works with their example code. (They did use the now-outdated Hortonworks sandbox, which made a few chapters hard to follow, but it was not hard to figure out where menus and buttons had moved.)
A nice final touch that I felt was great was the section on results visualization. How would you communicate what you found to others? Will you point them at some hard-to-read printout, or show them a picture or graph that makes your findings easy to read?
So... many, many gems in this book. It really gives you a great overview of the field of data science and gets you started not only in a strictly academic, demo-only way, but also in a real-life production environment.
I will definitely be re-reading this book and recommending it to my colleagues!
S**N
A good entry door to the huge and changing world of data science
“Data science,” the three authors of this book point out, “is a very wide field, so wide indeed that a book ten times the size of this one wouldn’t be able to cover it all. For each chapter, we picked a different aspect we find interesting. Some hard decisions had to be made to keep this book from collapsing your bookshelf!”
In my view, they have made very good choices. This “Introducing” book is well written and logically organized. It is generally aimed at individual computer users and people contemplating possible careers in data science. The book could also be good for managers and others trying to get a handle on how some data science techniques could be brought to bear on their growing mounds of business data.
If you are impatient to dive straight into dicing, slicing and graphing big data, you should know that books from Manning generally don’t follow that kind of quick approach. You get some overviews, explanations and theory first, and then you ease into the heart of the matter. In this book, you get to “First steps in big data” in chapter five, after first delving into the data science process: 1. Setting the research goal; 2. Retrieving data; 3. Data preparation; 4. Data exploration; 5. Data modeling; and 6. Presentation and automation.
Chapter five is also preceded by chapters on machine learning and on how to handle large data files on a single computer.
The “First steps” chapter, meanwhile, shows how to work (at the sandbox level) with two big data applications, Hadoop and Spark, and demonstrates how Python can be used to write big data jobs.
From there, you move on to (1) the use of NoSQL databases and graph databases, (2) text mining and text analytics, and (3) data visualization and creating a small data science application.
It should be noted and emphasized, however, that “Introducing Data Science” does present “An introductory working example of Hadoop” at the end of Chapter 1. The authors explain how to run “a small application in a big data context,” using a Hortonworks Sandbox image inside VirtualBox.
It’s not grand, but it’s a start.
Near the beginning of their book, the authors include a wise quote from Morpheus in “The Matrix”: “I can only show you the door. You’re the one that has to walk through it.” This book definitely is a good entry door to the huge and changing field of data science.
(My thanks to Manning for providing a review copy.)