
Good strategies for developing throwaway code?

I frequently write throwaway code (in a research environment) - for example to explore an algorithm or a model for a scientific property or process. Many of these "experiments" are one-off but sometimes I find that I need to use a few later. For example I have just unearthed code for string matching I wrote 7 years ago (stopped because of other priorities) but which is now valuable for a coworker's project. Having looked at it (did I really write such impenetrable code?) I realise there are some things I could have done then to help me when I restarted the "project" ("experiment" is still a better word). The earlier experiment "worked" but I know that at the time I would not have had time to refactor as my priorities lay elsewhere.

What approaches are cost-effective in enabling such work to be dug up and re-used?

EDIT: I have answered my own question (below) because there are issues beyond the actual source itself.

peter.murray.rust asked Sep 03 '09 15:09





2 Answers

I disagree with all of the answers saying "write comments". Comments are being offered as a cure-all for code that is not understandable in the first place.

Get yourself a copy of Code Complete (Steve McConnell, 2nd edition). If you learn the techniques of writing maintainable code in the first place, it won't take you more time, and you will be able to return to your work later with less trouble.

Which would you prefer:

  • Cryptic code with comments?
  • Mostly OK code without?

I strongly prefer the latter. The OK code is still readable in the places where the cryptic code would have been left uncommented, and comments are one more place where the original developer can make mistakes. The code may be buggy, but it is never wrong about what the program actually does.
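To make the contrast concrete, here is a minimal Java sketch. Both methods do the same thing; the class and method names (`NamingContrast`, `p`, `keepWordsLongerThan`) are made up for illustration and do not come from any real project.

```java
import java.util.ArrayList;
import java.util.List;

class NamingContrast {

    // Cryptic version: the comment has to carry all the meaning, and it can
    // silently go stale if someone later changes the comparison.
    // p: returns the items of l that are longer than n
    static List<String> p(List<String> l, int n) {
        List<String> r = new ArrayList<>();
        for (String s : l) {
            if (s.length() > n) {
                r.add(s);
            }
        }
        return r;
    }

    // Clearer version: identical logic, but the names say what happens,
    // so there is no separate comment to keep in sync.
    static List<String> keepWordsLongerThan(List<String> words, int length) {
        List<String> longWords = new ArrayList<>();
        for (String word : words) {
            if (word.length() > length) {
                longWords.add(word);
            }
        }
        return longWords;
    }
}
```

Seven years on, a call such as `keepWordsLongerThan(words, 2)` can be read at the call site, whereas `p(l, 2)` sends you hunting for the definition and hoping the comment is still true.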

Once you're comfortable with Code Complete, I'd recommend The Pragmatic Programmer, as it gives slightly higher-level software-development advice.

Phil Miller answered Oct 22 '22 14:10



[Answering own question] There are several other aspects to the problem which haven't been raised and which I would have found useful when revisiting the code. Some of these may be "self-evident", but remember that this code predates SVN and IDEs.

  • Discoverability. It has been difficult even to find the code. I believe it's in my SourceForge project, but there are so many versions and branches over 7 years that I can't find it. I would have needed a system that searched code, and until IDEs appeared I don't think there was one.
  • What does it do? The current checkout contains about 13 classes (all in one package, as it wasn't easy to refactor at the time). Some are clear (DynamicAligner) but others are opaque (MainBox, so named because it extended a Swing Box). There are four main() programs and about 3 subprojects in the distribution. So it is critical to have an external manifest saying what the components actually are.
  • Instructions on how to run it. When run, main() offers a brief command-line usage (e.g. DynamicAligner file1 file2) but doesn't say what the contents of the files actually look like. I knew this at the time, of course, but not now. So there should be associated example files in sibling directories. These are more valuable than trying to document file formats.
  • Does it still work? It should be possible to run each example without thinking. The first question will be whether the associated libraries, runtimes, etc. are still relevant and available. One ex-coworker wrote a system which only runs with a particular version of Python. The only answer is to rewrite. So we should certainly avoid any lock-in where possible, and I have trained myself (though not necessarily coworkers) to do this.
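As an illustration of the "instructions on how to run it" point, a usage message can say what the inputs look like and where the examples live. This is only a sketch: `DescribedTool`, the "one sequence per line" format and the `../examples/` directory are all hypothetical, since the file format of the original DynamicAligner is not recorded here.

```java
class DescribedTool {

    // Hypothetical usage text: the input format and the examples directory are
    // placeholders, not a description of the original DynamicAligner.
    static String usage() {
        return String.join(System.lineSeparator(),
            "Usage: DescribedTool file1 file2",
            "  file1, file2: plain text, one sequence per line (hypothetical format)",
            "  Worked examples: see ../examples/ (hypothetical sibling directory)");
    }

    // The tool expects exactly two file arguments.
    static boolean hasValidArguments(String[] args) {
        return args.length == 2;
    }

    public static void main(String[] args) {
        if (!hasValidArguments(args)) {
            // Print the self-describing usage text and stop with a non-zero status.
            System.err.println(usage());
            System.exit(1);
        }
        // Placeholder for the real work of the experiment.
        System.out.println("Would process " + args[0] + " and " + args[1]);
    }
}
```

Pointing at example files from the usage text costs a couple of lines when the code is written and answers the "what do the files look like?" question years later.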

So how can I and coworkers avoid problems in the future? I think the first step is that there should be a discipline of creating a "project" (however small) when you create code and that these projects should be under version control. This may sound obvious to some of you, but in some environments (academia, domestic) there is a significant overhead to setting up a project management system. I suspect that the majority of academic code is not under any version control.

Then there is the question of how the projects should be organized. They can't be on SourceForge by default, as the code is (a) trivial and (b) not open by default. We need a server which can hold both communal projects and private ones. I would estimate the effort to set this up and run it at about 0.1 FTE, that is, about 20 days a year from all parties (installation, training, maintenance). If there are easier options I'd like to know, as this is a large expense in some cases: do I spend my time setting up a server or do I write papers?

The project should try to encourage good discipline. This is really what I was hoping to get from this question. It could include:

  1. A template of required components (manifest, README, log of commits, examples, required libraries, etc.). Not all projects can run under Maven (e.g. FORTRAN).
  2. A means of searching a large number (hundreds at least) of small projects for mnemonic strings. I liked the idea of dumping the code in Google Docs, and this may be a fruitful avenue, but it's extra maintenance effort.
  3. Clear naming conventions. These are more valuable than comments. I now regularly have names of the type iterateOverAllXAndDoY. I try to use createX() rather than getX() when the routine actually creates information. I have a bad habit of calling routines process() rather than convertAllBToY().
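For item 2, one low-maintenance possibility is a small search program run over a local directory of checkouts. A minimal sketch, assuming all the small projects sit under one root directory; `ProjectSearch` and `findFilesContaining` are hypothetical names, not part of any existing tool.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ProjectSearch {

    // Returns, in sorted order, every regular file under root whose text
    // contains the given term.
    static List<Path> findFilesContaining(Path root, String term) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(path -> fileContains(path, term))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    // Decodes as ISO-8859-1 so that arbitrary bytes never cause a decoding
    // error; this is enough for matching ASCII identifiers in source files.
    static boolean fileContains(Path path, String term) {
        try {
            String text = new String(Files.readAllBytes(path), StandardCharsets.ISO_8859_1);
            return text.contains(term);
        } catch (IOException e) {
            return false; // unreadable files are skipped rather than aborting the search
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: ProjectSearch rootDirectory searchTerm");
            System.exit(1);
        }
        for (Path match : findFilesContaining(Paths.get(args[0]), args[1])) {
            System.out.println(match);
        }
    }
}
```

Searching for a remembered class name such as DynamicAligner would then list every file that mentions it. This reads each file fully into memory, which should be fine for hundreds of small projects but is not meant for very large files.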

I am aware of, but haven't used, Git, Mercurial and Google Code. I do not know how much effort these are to set up or how many of my concerns they answer. I would be delighted if there were an IDE plugin which helped create better code (e.g. by flagging "poor choice of method name").

And whatever the approaches, they have to come naturally to people who do not naturally have good code discipline, and they have to be worth the effort.

peter.murray.rust answered Oct 22 '22 13:10
