Sunday, April 6, 2014

Scala, Maven and Eclipse

Apache Spark

My plan is to experiment with Apache Spark.

Preparation

Install the following plug-ins into Eclipse:

Scala/Maven project

To create a new Scala project with Maven, follow the instructions at https://github.com/davidB/scala-archetype-simple. m2e does not seem to find the archetype in the Maven catalog, so I created the project directly from the console:
mvn archetype:generate -B \
  -DarchetypeGroupId=net.alchim31.maven -DarchetypeArtifactId=scala-archetype-simple -DarchetypeVersion=1.5 \
  -DgroupId=com.company -DartifactId=project -Dversion=0.1-SNAPSHOT -Dpackage=com.company
Now add the Spark dependency to the <dependencies> section of the project's pom.xml:
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>0.9.0-incubating</version>
    </dependency>
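You might have to select a different version. The _2.10 suffix of the artifactId is the Scala binary version the Spark artifact was built for, so it has to match the Scala version your project uses.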
Next, follow the instructions at http://spark.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala to create a simple example. Note that the SparkContext object needs to know the Jar file name, so change the example to point at the Jar that Maven builds. Use Maven to package and run it:
mvn package exec:java -Dexec.mainClass=package.of.your.App
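As a rough sketch, the quick-start example adapted to the project generated above might look like the following. It is not the official example verbatim, and it rests on several assumptions: the object name SimpleApp is made up, the package is the one passed as -Dpackage, the Jar path follows Maven's default naming of artifactId-version.jar under target/, Spark runs with the "local" master, and pom.xml merely stands in as a text file that is known to exist in the project root.

```scala
package com.company

import org.apache.spark.SparkContext

// Hypothetical object name; the package matches -Dpackage=com.company above.
object SimpleApp {
  def main(args: Array[String]): Unit = {
    // Assumption: any text file will do as input; pom.xml sits in the project root.
    val inputFile = "pom.xml"

    // Assumption: Maven's default Jar name, <artifactId>-<version>.jar under target/.
    val appJar = "target/project-0.1-SNAPSHOT.jar"

    // "local" runs Spark inside this JVM. SPARK_HOME may be unset here,
    // in which case null is passed (the parameter's default value).
    val sc = new SparkContext("local", "Simple App", System.getenv("SPARK_HOME"), List(appJar))

    // Count the lines containing "a" and "b", as in the quick-start guide.
    val lines = sc.textFile(inputFile, 2).cache()
    val numAs = lines.filter(line => line.contains("a")).count()
    val numBs = lines.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))

    sc.stop()
  }
}
```

With an object like this in place, the main class to pass to Maven would be com.company.SimpleApp, and the command has to be run from the project root so that the relative paths resolve.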