Key Features
- Build your expertise in processing real-time data with Apache Flink and its ecosystem
- Gain insights into the workings of all components of Apache Flink, such as FlinkML, Gelly, and the Table API, with real-world use cases
- Your guide to taking advantage of Apache Flink for solving real-world problems
Book Description
With the advent of massive computer systems, organizations in different domains generate large amounts of data in real time. The latest entrant to big data processing, Apache Flink, is designed to process continuous streams of data at a lightning-fast pace.
This book will be your definitive guide to batch and stream data processing with Apache Flink. The book begins by introducing the Apache Flink ecosystem, setting it up, and using the DataSet and DataStream APIs for processing batch and streaming datasets. Bringing the power of SQL to Flink, the book then explores the Table API for querying and manipulating data. In the latter half of the book, readers will learn the rest of the Apache Flink ecosystem to accomplish complex tasks such as event processing, machine learning, and graph processing. The final part of the book covers topics such as scaling Flink solutions, performance optimization, and integrating Flink with other tools such as Elasticsearch.
Whether you want to dive deeper into Apache Flink or investigate how to get more out of this powerful technology, you’ll find everything inside.
What you will learn
- Learn how to build end-to-end real-time analytics projects
- Integrate with an existing big data stack and utilize existing infrastructure
- Build predictive analytics applications using FlinkML
- Use the graph library to perform graph querying and search
About the Author
Tanmay Deshpande is a Hadoop and Big Data evangelist. He currently works with Symantec Corporation as a Sr. Software Engineer in Pune, India. He has an interest in a wide range of technologies, including Hadoop, Hive, Pig, NoSQL databases, Mahout, Sqoop, Java, and cloud computing. He has vast experience in application development in various domains, such as finance, telecom, manufacturing, security, and retail. He enjoys solving machine-learning problems and spends his time reading anything that he can get his hands on. He has a great interest in open source technologies and has been promoting them through his talks. He has been invited to various computer science colleges to conduct brainstorming sessions with students on the latest technologies.
Before joining Symantec Corporation, he was with Infosys, where he worked as the Lead Big Data / Cloud Developer and was a core team member of the Infosys Big Data Edge platform. Through his innovative thinking and dynamic leadership, he has successfully completed various projects.
Before writing this book, he authored two others, Mastering DynamoDB and Cloud Computing. He regularly blogs on his website: http://hadooptutorials.co.in