CS345 - Software Engineering - Spring 2005-2006
Charlie Peck and Chris Hardie
Department of Computer Science - Earlham College



Lab #3 - Comma Separated Values (CSV)

Last updated:  Thursday, 09-Mar-2006 21:20:49 EST

Due in class on Tuesday March 7th, 2006.

All of the files for this assignment can be found in ~charliep/courses/cs345/labs/csv on the CS file system.

This assignment is designed to explore a common data interface technique, comma separated values (CSV). While XML may ultimately replace CSV for many application spaces, for now there is still much value in the study of CSV.

In chapter 4 of the text, K&P develop the consumer side of a CSV interface; that is, their example code takes CSV formatted data as input. Your assignment is to take raw data as input and produce CSV formatted output. This lab is basically exercise 4-4 with the details filled in.

We provide the raw data files you will test with, along with one version of K&P's code from chapter 4 in case you want to explore it or use it for testing your code. The raw data files are streams of data, with values separated by a space and records terminated by a CR-LF pair (stream-LF in C speak). You will read and parse these files and output the data as CSV, with a CR-LF pair at the end of each record. The output of your program should follow the specifications found in chapter 4, e.g. strings are quoted and numeric values are not.
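The conversion described above can be sketched in a few lines. This is only a minimal illustration in Python, not a reference solution: the function names `to_csv_field` and `raw_to_csv` are hypothetical, the rule used to decide what counts as numeric is an assumption, and doubling embedded quotes is the usual CSV convention rather than something stated here. Check each of these against chapter 4 before relying on them.

```python
import re

# Assumption: a field is numeric if it looks like an optionally signed
# integer or decimal number (with an optional exponent). Anything else
# is treated as a string.
_NUMERIC = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?")


def to_csv_field(value):
    """Return value as a CSV field: numbers bare, strings quoted."""
    if _NUMERIC.fullmatch(value):
        return value
    # Assumption: an embedded quote is written as two quotes, then the
    # whole field is wrapped in quotes.
    return '"' + value.replace('"', '""') + '"'


def raw_to_csv(raw):
    """Convert space-separated records to CSV records ending in CR-LF."""
    out = []
    # splitlines() accepts CR-LF terminated records as well as bare LF.
    for record in raw.splitlines():
        fields = record.split(" ")
        out.append(",".join(to_csv_field(f) for f in fields) + "\r\n")
    return "".join(out)
```

When reading and writing the actual files, open them so that line endings are not translated (for example with `newline=""` in Python), otherwise the CR-LF pairs may be altered on the way in or out.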

Language choice will make an enormous difference in the amount of time it takes to complete this lab. Think about how you might re-use your work in the context of the group project. Remember: reduce, reuse, and recycle whenever possible.

There is a Makefile in the directory which you are free to use. We encourage you to give it a try; the instructions are in the Makefile itself. The Makefile also contains important instructions about testing your code, which I repeat here.

K&P's code takes CSV as input and generates raw data as output. Your program takes raw data as input and produces CSV. To test it, feed your CSV output to my kpcsv, which will regenerate raw data, and then diff that result against the original raw data file. If the two match, your code is producing CSV that kpcsv parses back into the original data. Here's a set of sample command lines which illustrates the process:

	$ cd your-directory
	$ ./your-program -i /clients/users/charliep/courses/cs345/csv/test-01.raw -o test-01.csv
	$ cat test-01.csv | /clients/users/charliep/courses/cs345/csv/kpcsv.bsd > test-01.raw
	$ diff -b --brief test-01.raw  /clients/users/charliep/courses/cs345/csv/test-01.raw

diff should report no differences if your code is working correctly. Note that there are two kpcsv binaries; use the one that matches the platform you are working on, FreeBSD (quark) or Red Hat Linux (ACL). Your screenshot should show this series of steps for each of the four data files I provide.
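Before running the real pipeline, the same round-trip idea can be tried in memory. The sketch below is only an approximation under stated assumptions: Python's standard `csv` module stands in for kpcsv (the real binary may behave differently), the comparison roughly imitates what `diff -b` ignores, and the names `csv_to_raw`, `normalize`, and `round_trip_matches` are hypothetical. It does not replace the kpcsv and diff steps you need to show in your screenshot.

```python
import csv
import io
import re


def csv_to_raw(csv_text):
    """Stand-in for kpcsv: turn CSV records back into space-separated lines."""
    # newline="" leaves the CR-LF pairs alone so the csv module sees them.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return "".join(" ".join(row) + "\r\n" for row in reader)


def normalize(text):
    """Roughly imitate diff -b: collapse runs of blanks, drop trailing ones."""
    return [re.sub(r"[ \t]+", " ", line).rstrip() for line in text.splitlines()]


def round_trip_matches(original_raw, csv_text):
    """True if the CSV converts back to the original raw data."""
    return normalize(csv_to_raw(csv_text)) == normalize(original_raw)
```

A passing check here only says the CSV is well formed by the `csv` module's rules; the diff against kpcsv's output remains the test that counts for this lab.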

You will need to turn in the following in class on Tuesday March 7th:

  1. A printout of your code. Make sure your comments include build and usage information, command line options, etc.

  2. A printed screenshot of your program using our test-*.raw files as input and our kpcsv code. This will demonstrate that your code works correctly. See the Makefile for an example of how to do this (this is useful even if you aren't going to use make).

  3. A short write-up which describes your library. At a minimum this should describe each of the entry points, their arguments, and any data structures used by callers of your library. This can be in the form of a man page; see "$ man bsearch" and "$ man srand" for examples.

  4. A tarball of your code and your Makefile if you use one. Put this in ~charliep/homework/cs345/-4-4.tar.gz

 
