

In Chapter 3 we dive into analysis, followed by modeling, with examples that run on a single-machine cluster: your personal computer. In other words, you will need to do some wax-on, wax-off practice before you become fully immersed in the world of Spark.
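As a quick preview of what that local setup looks like, here is a minimal sketch using sparklyr, assuming you have Java and the sparklyr package installed; the examples in the coming chapters build on this same pattern:

```r
library(sparklyr)

# One-time setup: download a local copy of Spark
spark_install()

# Connect to a local, single-machine "cluster" running on this computer
sc <- spark_connect(master = "local")

# ... analyze, model, read, and write data here ...

spark_disconnect(sc)
```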
This chapter takes a tour of the tools you will need to become proficient in Spark. We encourage you to walk through the code as you read, because it will take you through the motions of analyzing, modeling, reading, and writing data, and we hope it leaves you excited about large-scale computing.

If you are new to R, it should now be clear that combining Spark with data science tools such as ggplot2 for visualization and dplyr for data transformation opens up a promising landscape for doing data science at scale. It should also be clear how Spark solves problems: by distributing work across multiple computers when data does not fit on a single machine or when computation is too slow.
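To make that concrete, here is a minimal sketch of the pattern, assuming sparklyr and a local Spark connection: dplyr verbs are translated to Spark SQL and executed inside Spark, while only the small aggregated result is collected back into R for plotting with ggplot2.

```r
library(sparklyr)
library(dplyr)
library(ggplot2)

sc <- spark_connect(master = "local")

# Copy a small built-in dataset into Spark (real workloads would read
# data that already lives in the cluster)
cars <- copy_to(sc, mtcars, overwrite = TRUE)

# These dplyr verbs run inside Spark, not in R
summary_tbl <- cars %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg, na.rm = TRUE)) %>%
  collect()  # bring only the small aggregated result back into R

# Visualize the collected result locally with ggplot2
ggplot(summary_tbl, aes(x = factor(cyl), y = avg_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Average MPG")

spark_disconnect(sc)
```

The key design point is where each step runs: the transformation executes in Spark, which can span many machines, and only the summarized data crosses back into R for visualization.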
