15619 Project1.1 Guide

Project1.1 Sequential Programming

Goal: Write scripts and code to filter and analyze a medium-sized dataset locally.

Filter

First, we need to filter the data with our own program before analyzing it. I used Java, but you can use Python or Bash if you are comfortable with them. My program uses a BufferedReader to check each line. Since the resulting data set has to be sorted, I temporarily store all matches in an ArrayList and sort it at the end. For a larger data set this approach is not recommended, because holding every match in memory will exhaust it. If any filtering condition confuses you, ask your TAs to clarify it, since a misread condition can change your results greatly.
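To make the read, collect, and sort flow concrete, here is a minimal sketch rather than my actual submission. The class name FilterSketch, the sample input, and the keep rule (drop blank lines and lines starting with '#') are all made up for illustration; the real filtering conditions come from the project write-up.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

class FilterSketch {

    // Reads every line, keeps the ones accepted by `keep`,
    // and sorts the collected matches once at the end.
    static List<String> filterAndSort(BufferedReader reader, Predicate<String> keep)
            throws IOException {
        List<String> matches = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (keep.test(line)) {
                matches.add(line);
            }
        }
        Collections.sort(matches); // natural String ordering
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample standing in for the real data file.
        String sample = "banana 3\n\napple 5\n# comment\ncherry 1\n";
        // Hypothetical rule, NOT the project's real conditions.
        Predicate<String> keep = l -> !l.isEmpty() && !l.startsWith("#");
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            for (String match : filterAndSort(reader, keep)) {
                System.out.println(match);
            }
        }
    }
}
```

For the real data you would wrap a FileReader instead of the StringReader. Everything still sits in the ArrayList until the sort, which is why this only suits a data set that fits in memory.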

Analysis

For the analysis, we have to answer nine questions using the filtered data set. Even if you think your answer-hunting programs are correct, it is worth going back to check your filter program to see whether you’ve misunderstood some conditions. For the first several questions, such as counting the total number of lines, I strongly recommend using Bash. One reason is that the programs will be really simple; you may need only one line to do the job. Another reason is that you will have to use Bash in a future project anyway. Yes, you can’t run away from it, so enjoy. For the more complicated questions, you will use regular expressions to match the words. This site can check your expressions: https://regex101.com/. To fill in the answer sheet (runner.sh), open it with a text editor (I used vim) and put your code in.

For Java users, replace the “echo” there with the “javac” and “java” commands. Please also upload your source code to the same folder:

answer_2() {
javac Project1_1.java
java Project1_1
}

For Bash users, replace the “echo” with your command:

answer_1() {
grep -P 'Keyword' FileName.csv | wc -l
}
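If you would rather stay in Java for a regular-expression question, the same "count the matching lines" idea can be sketched as follows. This is only an illustration: the class name RegexCountSketch, the keyword "cloud", and the sample lines are assumptions, not part of the project.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;

class RegexCountSketch {

    // Counts the lines that contain at least one match of `pattern`,
    // roughly what `grep -P ... | wc -l` reports.
    static long countMatchingLines(BufferedReader reader, Pattern pattern)
            throws IOException {
        long count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (pattern.matcher(line).find()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample lines standing in for the filtered data set.
        String sample = "Cloud computing\nno match here\ncloudy skies\nthe cloud\n";
        // Hypothetical keyword: the whole word "cloud", ignoring case.
        Pattern keyword = Pattern.compile("\\bcloud\\b", Pattern.CASE_INSENSITIVE);
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(countMatchingLines(reader, keyword)); // prints 2
        }
    }
}
```

The \b word boundaries keep "cloudy" from counting as a match. You can try the same expression on regex101 first, remembering that the backslashes are doubled only inside the Java string literal.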

Test your answers by running

./runner.sh

If everything looks right, submit it. You should have unlimited tries throughout all the projects, but please confirm this with your TAs.

Overall, it’s an easy project to warm you up. I don’t have a programming background, but it took me only one day to finish. Cheers.