Software Engineering Journal

2006-04-06
  • Design We opted for extreme simplicity in our interface and code. We used Perl for rapid development and for its large number of useful modules. The components talk to each other over STDIN/STDOUT, using XML for structure.
  • Role We decided from the beginning to run our project in as communal and non-hierarchical a way as possible. Despite that, Dave and I ended up assuming the leadership roles. Specifically, Dave was the Group Hack, while I did more to organize the group and our meetings.
  • Group dynamics I feel like the group successfully designed a simple and effective solution for the assignment. We've worked well together, and we've used the available communication media well. On the other hand, I feel like the workload has been somewhat asymmetric, and that we need to do more to convey our methodology to the other group members.
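The STDIN/STDOUT-plus-XML design described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl we actually used, and the `<items>`/`<item>` element names and the `length` attribute are hypothetical, not our project's real format. Each component reads one XML document from an input stream, annotates it, and writes XML to an output stream, so components can be chained in a pipeline.

```python
import io
import sys
import xml.etree.ElementTree as ET


def run(instream, outstream) -> None:
    """Read one XML document, annotate it, and write it back out.

    Pass sys.stdin and sys.stdout to use this as a pipeline stage.
    """
    root = ET.fromstring(instream.read())
    # Hypothetical transformation: record the text length of each <item>.
    for item in root.iter("item"):
        item.set("length", str(len(item.text or "")))
    outstream.write(ET.tostring(root, encoding="unicode") + "\n")


if __name__ == "__main__":
    # Demo input stands in for whatever the previous component would emit.
    demo_in = io.StringIO("<items><item>abc</item><item/></items>")
    run(demo_in, sys.stdout)
```

Because every stage only sees streams, a stage can be tested in isolation by feeding it a string, which is most of the appeal of keeping the interface this simple.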
    2006-03-31
    2006-03-28
    2006-03-09

    Data mining for schools

    • It seems like better data mining for educational funding would be useful. The information from schools (funding, grades, state testing results, after-school programs) is there, but in many cases the data is not easily accessible. For example, fraud in the Pell Grant system possibly accounts for hundreds of millions of dollars per year, but the cost of the data processing required to find the fraud is high. If that cost could be brought down, Pell Grants could better serve those they are meant to serve.
    • Being able to "slice" educational data better would be useful as well. Charter schools, to choose one case, are relatively new in many areas. While school districts collect data on individual schools, the figures tend to be combined and averaged above the district level. Keeping a distinction between charter schools and regular public schools is as useful as separating public schools from private schools.
    • And, in relation to current events, there are health benefits to better data mining techniques. Dana Reeve, who never smoked in her life, recently died of lung cancer. Unfortunately, research into lung cancer lacks funding because of the stigma of smoking attached to it. Better data mining would give a truer picture of lung cancer and its victims, and enable better research into it.
    2006-03-02

    Software portability

    • My experience writing and maintaining software agrees with ESR. It basically does come down to using already-supported tools whenever possible. For example, when I need to transfer a text file from one computer to another, I don't write my own network protocol. I use something existing such as ssh. I use system calls whenever possible to avoid differences in architecture, and use a language that is portable across as many machines as possible. As far as the future goes, I don't see a significant change in how software is distributed anytime soon. The infrastructure simply isn't in place yet to handle the move from the current distributed computing environment back to a centralized computing environment. Google is apparently heading in that direction, but the problem is that it could keep the source of any application it runs hidden indefinitely, because it would never even have to distribute the application.
    2006-02-24

    How many tests?

    • I would say it depends on what I'm trying to do. I would try to test all lower and upper bounds on input, as well as several inputs between the two bounds. If I'm going for statistical relevance, I would say at least five repeats of each test should be done. The number of tests I run also depends on how much time I have to perform the tests, and on a cost/benefit analysis of whether the test will be profitable given the current conditions.
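    The rule of thumb above (both bounds, a few values in between, five repeats each) can be sketched as a small helper. This is only an illustration under assumptions: the function names `boundary_inputs` and `run_repeated`, the choice of evenly spaced interior points, and integer inputs are all mine, not part of any particular test framework.

```python
def boundary_inputs(lower: int, upper: int, interior: int = 3) -> list:
    """Return the lower bound, evenly spaced interior values, and the upper bound."""
    if upper - lower <= interior:
        raise ValueError("range too small for the requested interior points")
    # Integer arithmetic keeps the interior points deterministic.
    between = [lower + (upper - lower) * i // (interior + 1)
               for i in range(1, interior + 1)]
    return [lower, *between, upper]


def run_repeated(func, value, repeats: int = 5) -> list:
    """Call func(value) several times and collect the results for comparison."""
    return [func(value) for _ in range(repeats)]


if __name__ == "__main__":
    # Hypothetical function under test: absolute value on the range -10..10.
    for value in boundary_inputs(-10, 10):
        results = run_repeated(abs, value)
        # A deterministic function should give the same answer every repeat.
        print(value, results, len(set(results)) == 1)
```

    For a function with timing or randomness involved, the repeated results would be fed into whatever statistic matters (mean, variance) instead of a simple equality check.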
    2006-02-21

    Questions from Bill Joy:

    • Was Oppenheimer right when he said there is no aspect of science so terrible that it should never be explored?
    • Should it be humanity's destiny to forever expand, in any sense of the word? Should there be a point when we decide that enough is enough, or even attempt to live more simply?
    2006-02-08
    • I started by reading all of the IMAP RFC (RFC 3501). This gave me a good background on the command syntax of IMAP.
    • Next, I did a packet capture during a mutt IMAP transaction against quark. (This later turned out to be unnecessary once I figured out how to enable mutt's existing debug code.)
    • I did a grep for the commands that pulled the headers and messages down. I tracked down the two files, imap/{commands,message}.c.
    • I replaced the BODY.PEEK[] command with a (BODY.PEEK[HEADER] BODY.PEEK[1]) command. This should have the effect of grabbing only the header and the first body part rather than the entire message. Unfortunately, this breaks mutt_copy_message() in copy.c, which is called by mutt_display_message() in commands.c. This appears to be because the fetched message is interpreted as control data from the server rather than as a message. The only difference I can see between the debugging output of the two transactions is that imap_read_literal() is called in the successful one but not in the failed one. I've made the successful and failed runs web accessible.
    • My plan is to track down why imap_read_literal() isn't called.
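    To make the change in the steps above concrete, here is a rough sketch of how the two FETCH variants differ on the wire. It is Python, not mutt's C code, and it does not reproduce anything from imap/message.c; the helper name `build_fetch`, the tag `a001`, the UID 42, and the use of UID FETCH rather than a sequence-number FETCH are all assumptions for illustration. The only thing taken from RFC 3501 is that a single data item is sent bare while several items are sent as a parenthesized list.

```python
def build_fetch(tag: str, uid: int, items: list) -> str:
    """Build one IMAP UID FETCH command line (RFC 3501 syntax)."""
    if not items:
        raise ValueError("at least one fetch data item is required")
    # One item is sent as-is; several items go in a parenthesized list.
    spec = items[0] if len(items) == 1 else "(" + " ".join(items) + ")"
    return f"{tag} UID FETCH {uid} {spec}\r\n"


if __name__ == "__main__":
    # Original behaviour: pull the entire message, attachments included.
    whole = build_fetch("a001", 42, ["BODY.PEEK[]"])
    # Modified behaviour: only the header and the first body part.
    partial = build_fetch("a002", 42, ["BODY.PEEK[HEADER]", "BODY.PEEK[1]"])
    print(repr(whole))
    print(repr(partial))
```

    The second form makes the server return two separate data items in one FETCH response instead of a single one, so any client code that expects exactly one item per response has to be adapted as well.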
    2006-02-03

    For my project, I'm going to work on a small bug in mutt that causes large attachments to be downloaded over IMAP even when the user only views the message body. This will be a significant performance boon for those on low-speed links, or those who pay by the bit. It is PR 49 at bugs.mutt.org.

    My question is this: What benefits can open-source software have for education? More specifically, how difficult would it be to develop software that can benefit people with communications disabilities (speech, writing, etc.), which is an area where approaches are both highly proprietary and expensive?

    2006-01-27

    My favorite language depends on what I'm doing. I use C when I need consistency or access to low-level memory management. I use Perl for work that needs to be done fast, or that can benefit from regular expressions, while still retaining much of C's strength.

    2006-01-20

    I have two of them:

    • Parallel programming
    • Automated testing, especially as applied to systems administration.