Goal: filter and analyze a large dataset using Hadoop MapReduce on AWS EMR.
- Do not use periods (“.”) or any other non-alphanumeric characters in the name of the bucket to which your mapper and reducer code is uploaded; otherwise the EMR job might fail.
- You may want to keep your cluster alive by unchecking the “Terminate on failure” option and adding steps manually in the EMR web console.
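The bucket-name advice above can be checked before anything is uploaded. The following is a minimal sketch in Java, assuming the conservative rule stated here (letters and digits only). The names `BucketNameCheck` and `isSafeBucketName` are hypothetical. The lowercase-only restriction and the 3 to 63 character length bound come from S3's general bucket-naming rules, not from this assignment.

```java
import java.util.regex.Pattern;

public class BucketNameCheck {
    // Conservative rule (an assumption for this sketch):
    // lowercase letters and digits only, 3 to 63 characters long.
    private static final Pattern SAFE = Pattern.compile("^[a-z0-9]{3,63}$");

    // Returns true when the name contains no periods, hyphens,
    // or other non-alphanumeric characters.
    public static boolean isSafeBucketName(String name) {
        return name != null && SAFE.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Check every bucket name passed on the command line.
        for (String name : args) {
            String verdict = isSafeBucketName(name) ? "ok" : "rename before use";
            System.out.println(name + " -> " + verdict);
        }
    }
}
```

Run it with one or more candidate names as arguments, for example `java BucketNameCheck myemrbucket01 my.emr.bucket`.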
jar -cvf mapper.jar Mapper.class
jar -cvf reducer.jar Reducer.class
java -cp mapper.jar Mapper
java -cp reducer.jar Reducer
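Running the classes directly with `java -cp` suggests that they read records from standard input and write results to standard output. That is the Hadoop Streaming convention: one record per line, with key and value separated by a tab. The following is a minimal sketch of such a mapper, assuming that setup. The class name `SketchMapper`, the comma-separated input format, and the choice of the first field as the key are all hypothetical and stand in for the assignment's actual filtering logic.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class SketchMapper {
    // Returns "key<TAB>1" for a usable line, or null when the line is filtered out.
    // Assumption: records are comma-separated and the first field is the key.
    public static String mapLine(String line) {
        if (line == null || line.trim().isEmpty()) {
            return null; // drop blank lines
        }
        String key = line.split(",", 2)[0].trim();
        if (key.isEmpty()) {
            return null; // drop records with a missing key
        }
        return key + "\t1";
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        // Stream records from stdin and emit one key/value pair per kept record.
        while ((line = in.readLine()) != null) {
            String out = mapLine(line);
            if (out != null) {
                System.out.println(out);
            }
        }
    }
}
```

A mapper written this way can be tried locally by piping a small sample file (hypothetical name `sample.txt`) through it: `cat sample.txt | java SketchMapper | sort`.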