Posts Tagged “software”

@inproceedings{wu_using_2002,
address = {Paris, France},
title = {Using Graph Patterns to Extract Scenarios},
url = {http://swag.uwaterloo.ca/\~j25wu/papers/iwpc02-wu.pdf},
booktitle = {International Workshop on Program Comprehension},
author = {Jingwei Wu and Ahmed E. Hassan and Richard C. Holt},
month = jun,
year = {2002},
pages = {239--247}
}

Grok is a tool from Waterloo that can do algebraic manipulations of graph structures. In this paper, the graph structures are source code relationships like ‘calls’ or ‘uses’. They first extract ‘scenarios’ from the software model, which seems similar to my desire to extract high-level goals from requirements specifications or documentation. Again, the challenge is a reconstruction one, and seems very similar to software histories. Since we don’t have access to the decisions and thought-processes at the time of design, we must reconstruct these (albeit imperfectly) from original documents. Although imperfect, it isn’t much different from a historian analyzing Napoleon’s decision to attack Russia in 1812.

Source code extraction has better fidelity. After all, most version control systems allow you to check out code from a specific date, preserving perfectly the state of the software at that point in time. However, this can only ever reproduce the facts on the ground at that point in time (e.g., bugs, design, architecture) but not what is, in my opinion, the more interesting insight into rationale (why this design?). Similarly, we know Napoleon ended the invasion with massive casualties, but we have to settle for reasonable estimates as to why this disaster (or triumph) happened.

The target system here is Gnome’s Nautilus. The authors first generate facts about the source code, then manipulate this ‘factbase’ with Grok in order to produce something that can be used for architecture recovery. Using this knowledge, they then generate scenarios (e.g. interprocess communication via CORBA) and validate them against the extracted facts. There seems to be a certain degree of ‘find the facts which fit our scenario’, and one has to wonder to what extent the extraction process was ‘massaged’ to produce the graph that supported the scenario. Perhaps a more reasonable approach would have been to use a portion of the code to generate the patterns, and then search the remainder for similar matches (supervised classification). I think this issue is a common one in software engineering science. The Grok pattern matching extension seems very similar to the RDF query language SPARQL.
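To make the idea of matching a graph pattern against a factbase concrete, here is a minimal sketch in Python. It is not Grok’s syntax and not the authors’ implementation: the facts, the function names, and the `?variable` convention (borrowed loosely from SPARQL) are all invented for illustration.

```python
# Hypothetical factbase of (relation, source, target) triples. The names are
# made up for this sketch; they are not taken from Nautilus or the paper.
FACTS = {
    ("calls", "view_init", "corba_connect"),
    ("calls", "corba_connect", "orb_send"),
    ("calls", "view_init", "draw_icons"),
    ("uses", "orb_send", "socket_buf"),
}


def is_var(term):
    """Pattern terms that start with '?' are variables."""
    return term.startswith("?")


def match(pattern, facts, binding=None):
    """Yield every variable binding under which all pattern edges are facts."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for fact in facts:
        new = dict(binding)
        ok = True
        for p, f in zip(first, fact):
            if is_var(p):
                # Bind the variable, or check it agrees with an earlier binding.
                if new.setdefault(p, f) != f:
                    ok = False
                    break
            elif p != f:
                ok = False
                break
        if ok:
            yield from match(rest, facts, new)


# Pattern: some ?a calls ?b, which calls ?c, which uses ?d.
SCENARIO = [("calls", "?a", "?b"), ("calls", "?b", "?c"), ("uses", "?c", "?d")]
results = list(match(SCENARIO, FACTS))
```

On this toy factbase the pattern has exactly one match, the chain from `view_init` through `corba_connect` and `orb_send` to `socket_buf`. The trial-and-error the authors describe corresponds to editing `SCENARIO` until the matches look meaningful, which is exactly where the risk of fitting facts to the scenario creeps in.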

[One problem] was identifying a useful set of scenarios, because substantial knowledge about the important tasks of the system were required. Our experience showed that such knowledge could be accumulated during architecture extraction. In our analysis, the trial-and-error method was applied to extract useful scenarios. We carried out many iterations of defining patterns before we got useful matches. (p. 8/9)

The ability to query a factbase is very powerful, especially as a precursor to generating visualizations. In general, with large knowledge bases, it is silly to merely generate a large graph visualization without first manipulating the data to support user-defined queries. This paper confirms the utility of a graph-matching approach to pattern extraction, but highlights the difficulty that remains, namely, gaining sufficient understanding of the system to be able to pose lucid queries.


What is software development?

  • a craft: knowing how to elicit requirements, design test cases, implement design patterns, etc., is a craft and a process of continuous learning. Craft for me refers to an innate sense for how to do something. Good craftspeople may not be taught; they may be born.
  • a co-operative game: Cockburn’s notion that software development is not made up of good and bad decisions but rather, like reaching the South Pole, of moves that bring one closer to or farther from the goal. My quibble with this concept is that if we see software development as a Wicked Problem, then the goal is poorly defined or not defined at all, and determining what gets one closer to or farther from it is impossible. For example, is not documenting this design process a move closer, in that it allows more time for testing and development, or a move away, in that next round we won’t remember why we made that decision and might have to redesign it? It would be a cooperative effort to reach a South Pole that moves randomly each night. Cockburn addresses this by defining the goal fairly narrowly: when the software is delivered.
  • lean manufacturing: Cockburn believes agile development is essentially Japanese kaizen in a different form. For example, he illustrates decision dependencies, maps of where processes are bottlenecked.

Since becoming a grad student in software ‘engineering’ and requirements ‘engineering’, I’ve given a lot of thought to the ‘engineering discipline of software’. Shaw starts her seminal paper ‘Towards …’, implying one day we might reach the goal. I’m not convinced. I don’t have much industry experience to back me up, but it seems to me that agile techniques, in particular Cockburn’s empirically-backed statement that there is no obvious correlation between project success and process choice, suggest that the grand vision of engineering of software is misguided.

That isn’t to say that software development is all craft, either. Instead, I think what Cockburn is suggesting is that good software development is a combination of craft, softer skills like people management, and useful tools and practices, such as test-driven development, object-oriented analysis, web frameworks, etc.

In an interview, he suggests this is actually what engineering is about as well. For example, he mentions the Wright Bros. The analogy some draw between them and software is that it’s all about throwing yourself into a makeshift project and givin’ ‘er. Instead, he notes that the Wright Brothers used a wind tunnel, made copious notes on what worked and what didn’t, and applied well-understood laws of physics to their attempt.

The real challenge with the agile development approach (the four points) is the very real focus on working products. Agile-oriented projects force people, particularly managers, to produce solutions. I think one of the comforts of rigorous methodology is that it allows people to blame the process when something fails. Agile removes that security blanket and puts the responsibility for facilitating communication, identifying needs, solving problems, on the team itself, rather than the process.


A software system is a human-created artifact. And yet, even though humans are in control at every stage in the process, these systems often exhibit unanticipated effects. Most attention is focused on undesirable effects, such as the Ariane 5 failure. Presumably there are also unanticipated and unexpected positive effects that we aren’t told of, for the obvious reason that things work as expected, and there is no reason to question why.

Since software is human-controlled, there should be no obvious reason to compare it to natural systems like gene networks or chemical reactions. However, as Stuart Kauffman notes in “At Home in the Universe”, humans are restricted to guessing about the future. As such, we cannot anticipate all potential scenarios. This makes human-created artifacts subject to the same randomness that many natural processes are. Consider the biological notion of a ‘fitness landscape’. For a given fitness landscape there is a global maximum that describes the best-adapted genotype. Clearly, we should strive to have all software reach this global maximum. But, as I have mentioned, we cannot control all the variables, since we can’t anticipate all occurrences and changes in the environment. This means we can’t know ahead of time what the environment will be like when we build our software. If we knew, then we would clearly strive to build to this global maximum. Instead, we attempt to build to some pre-specified local maximum that, for us, is best adapted to our environment (see Jackson). However, even this is unlikely. For example, we may abandon certain features as deadline pressures mount. One way of looking at this is that we are changing the environment that we are targeting (abandoning requirements of the environment).

How might software mimic dynamical systems, like biological evolution? Evolution is essentially a process of striving towards these peaks in a fitness landscape. Evolving software should seek to improve its adaptation to a given set of environmental constraints (fitness pressures). The landscape shifts as new environmental constraints and conditions become known, so our software has to adapt itself (reproduce? variation? genetic drift? somehow adapt). Successful software will move to a new local maximum on the new fitness landscape. Constantly shifting environments might reflect an unstable ecology, in which maxima are impossible to find (rugged). Other landscapes might be extremely stable. For example, a long-running satellite might have very few environmental changes. We can use this landscape metaphor to model why different software seems to have different requirements.
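The landscape metaphor can be made concrete with a toy hill-climber. This is only a sketch under strong assumptions: the ‘design’ is a single integer, fitness is an invented function, and adaptation means stepping to a better neighbouring design. None of it is drawn from Kauffman’s models.

```python
def hill_climb(fitness, start, lo=0, hi=20):
    """Step to the better neighbour until no neighbour improves fitness."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbours, key=fitness)
        if fitness(best) <= fitness(x):
            return x  # local maximum: no single step improves fitness
        x = best


def landscape(peak):
    """Single-peaked toy landscape: fitness falls with distance from peak."""
    return lambda x: -abs(x - peak)


def rugged(x):
    """Two-peaked toy landscape: a low peak at 3 and a higher one at 15."""
    return max(4 - abs(x - 3), 9 - abs(x - 15))


old_env = landscape(peak=5)
design = hill_climb(old_env, start=0)        # adapt to the original environment
new_env = landscape(peak=12)                 # the environment shifts
adapted = hill_climb(new_env, start=design)  # re-adapt from the current design
stuck = hill_climb(rugged, start=0)          # stops on the lower of two peaks
```

Here `design` ends at 5 and `adapted` at 12, so the software tracks the peak as it moves. On the two-peaked landscape the climber halts at 3 even though the design at 15 is fitter, which is the sense in which incremental adaptation only ever promises a local maximum.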

I want to digress for a minute to discuss criticality. In small systems, we might not see any complexity effects. For example, code to post blog entries might seem easy to get mostly right (to hit a good local maximum). Why all the fuss? My contention is that at some threshold, which will change depending on what the environment is, software shifts from simple to complex. This notion is similar to laminar vs. turbulent flow in fluids. At some threshold, what was a simple-to-model system becomes nasty and impossible to predict. We need simulations and approximation algorithms to understand it. Well, I think the same is true of certain software systems. They may cross this threshold and have unpredictable effects - and this in a system in which we know all the inputs and all the outputs. Even with human control, in other words, we cannot truly control the outcomes.
