For the 2nd edition of the Dynamic Software Documentation (DysDoc) workshop, the organizing team wanted to push the boundary on how to engage the community in tool-supported demos. Previously, we had asked participants to come to the workshop (co-located with ICSME) with a tool to demo, live, to the other attendees. One of the goals was to see tools working on unseen data.
Academic job interview season is wrapping up, so I thought I’d capture the process from the Canada point of view.
At MSR '18 in Gothenburg, I presented my work on using Bayesian inference to set software metric thresholds. We want thresholds because for many software metrics, such as coupling between objects (CBO), a single, global cutoff ("all software objects at this value or below are maintainable") is nonsensical, if only because the choice of programming language matters. So we want to tailor threshold values to some contextually relevant value (e.g., perhaps all Java code should score X or less). The question I answered is how to do that tailoring, given some contextual features.
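This is not the model from the paper, just a hypothetical back-of-the-envelope sketch of the tailoring idea: compute each language's empirical 90th-percentile CBO and shrink it toward the global percentile, in an empirical-Bayes spirit. All data, names, and weights below are made up for illustration.

```python
# Hypothetical sketch of context-sensitive CBO thresholds via shrinkage.
# Not the actual Bayesian model from the MSR '18 paper.
from statistics import quantiles

# Made-up CBO measurements, grouped by programming language (the context).
cbo = {
    "java": [3, 5, 8, 12, 4, 6, 9, 7, 11, 5],
    "python": [1, 2, 4, 3, 2, 5, 3, 2, 4, 6],
}

def p90(xs):
    """Empirical 90th percentile (last of the nine deciles)."""
    return quantiles(xs, n=10)[-1]

# A single global threshold ignores context entirely.
global_threshold = p90([x for xs in cbo.values() for x in xs])

def contextual_threshold(xs, prior_weight=5):
    """Shrink the group's 90th percentile toward the global one;
    smaller groups are pulled harder toward the global prior."""
    w = len(xs) / (len(xs) + prior_weight)
    return w * p90(xs) + (1 - w) * global_threshold

for lang, xs in cbo.items():
    print(lang, round(contextual_threshold(xs), 2))
```

With this toy data, Java code ends up with a noticeably higher CBO threshold than Python code, which is the whole point: the "maintainable" cutoff depends on context rather than one global number.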
There has recently been more discussion about software documentation (or perhaps it's just that I see more of what I'm interested in… hard to say). At any rate, it seems a lot of the discussion inevitably breaks down into "what tool will solve my documentation problems?" (e.g., this thread). Others have tried to "fix" UML by proposing new modeling approaches (forgetting, perhaps, that the Unified Modeling Language was spurred by exactly this kind of proliferation of diagram notations).
I’ve read two papers recently (references) about using active learning to improve classification for software engineering.
A list of long, high-vertical-gain day hikes I have done and wish to do. Looking back, the most common theme to all of them was “bring more water”.
I have no idea whether Columbus had to have his “India Expedition” proposal peer-reviewed, but here is my interpretation of it according to the ever-popular Heilmeier catechism.
This year SCAM, the Working Conference on Source Code Analysis and Manipulation (located in Raleigh, NC, Oct 2–3 2016) includes an engineering track, as described here. The CFP is available here. This track will be co-chaired by myself and Jurgen Vinju. In this post I want to briefly explain what an engineering track is and why you should submit to it! 1
I recently reviewed data showcase papers for the Mining Software Repositories conference, and I’m co-chair of the Engineering track (which subsumes datasets, tools, and approaches) for the SCAM conference. 1 I’ve worked with a number of different datasets (both openly available and closed) in my research. This caused me to do some reflection on the nature of empirical data in SE.
Submit early, submit often! ↩
Andy Zaidman gave an interesting presentation about test analytics. The takeaways for me: a) people overestimate how much of their effort goes into unit testing (estimate: 50%; reality: 25%); but b) the real issue is convincing a developer that this particular unit test will improve the quality of the code. In other words, as with technical debt, refactoring, or commenting, the marginal utility of adding a test is perceived to be low (and of course the cost is seen as high). Each new individual test adds little immediate benefit (with some exceptions if one is following strict TDD), and yet each one requires switching from the mental model of the program to that of JUnit frameworks and test harnesses.
Software quality can be automatically checked by tools like SonarQube, CAST, FindBugs, Coverity, etc. But often these tools encompass several different classes of checks on quality. I propose the following hierarchy to organize these rules.
It’s a long-held view in the requirements engineering (RE) community that “if only we could do RE better, software development would be cheaper”. Here ‘doing RE better’ means that your requirements document adheres to some quality standard, such as IEEE 830. For example, none of the requirements are ambiguous.
My dad had this great cup from one of his visits to COMDEX (ostensibly to keep up with the latest in the tech world, which at the time COMDEX represented). It said “Garbage in, garbage out” (GIGO), and then had the name of some failed software company.
Today we conducted our first attempt at “Critical Research Reviews” (CRR) at our workshop on empirical requirements engineering (EmpiRE) at the 2015 Requirements Engineering Conference.
Over on my employer’s blog, I’ve written up our survey results on technical debt.
This past weekend was the Steel City Codefest. The idea is that community non-profits present some problem for which an “app” would help them, and coders spend 24 hours coming up with some solution. It was a lot of fun. You can see our team’s solution at http://citipark.herokuapp.com. Our challenge was to create an easier way for people to find the city of Pittsburgh’s GrubUp food program, which offers free lunch and breakfast at 80+ sites around the city in the summer (sadly, a lot of Pittsburgh youth are food insecure).
I’ve been doing a little thinking about frameworks lately. They fascinate me both as 1) a realization of the vision of ‘pluggable software’ and reusable components, desired since probably 1968; and 2) a commitment you make when you rely on one. This is prompted by this great post on libraries vs. frameworks.
One of my favorite graphics is from Al Davis, in 1988. Aside: it is depressing how often we re-invent the wheel in this business.
(I’ve typically posted long-form entries but so infrequently … )
I’m a fan of the Cynefin framework. I find it a great tool for understanding what type of problem you are trying to solve. The notion of complex/complicated/simple is quite helpful. You could do worse than to read Dave Snowden’s blog, as he explores each of the domains in the context (most often) of software projects.
It comes down to essential vs. accidental complexity, as outlined by Fred Brooks. What we research is new ways to ‘nibble’ at the accidental complexity: new languages (Go, Swift), new abstractions (actors vs. functional programming in distributed systems), new methodologies (random test case generation). It’s what nearly every story on Hacker News is about.
This post is spurred by a line in a paper by Walker Royce, son of Winston Royce, he of the (much-misunderstood) “waterfall model”. He says
I had this issue a few times:
The Circle is a novel about the tech/social networking industry, where fictional company the Circle plays the role of Twitter, Facebook and Google combined. The topic is certainly ripe for the satirizing, but I didn’t think Eggers pulled it off very adroitly. Either he was going for the brutal, over-the-top parody like Jonathan Swift’s *A Modest Proposal*, or he was writing it very quickly and perhaps in anger. The characters felt a little flat and one-dimensional (Mae for example was unbelievably naive) and the conspiracy (or ‘logical extension’ perhaps) was hard to believe: Americans are fiercely proud of being independent and private, so the idea that they would willingly join in with mandatory Circle membership felt off.
In the paper “The Past, Present and Future of Software Architecture”, the authors (Philippe Kruchten, Henk Obbink, and Judith Stafford) have a sidebar in which they list their selection of “Great Papers of Software Architecture”. I’ve tried to collect these papers and links thereto for future reading. Here is a bibtex file for full citations. I’ve also included the parenthetical comments of Kruchten et al. in italics, and my comments in bold. Unless otherwise noted I’ve linked directly to the PDFs (please let me know if a link breaks).
Post-doc positions in CS are a growing part of the research landscape, as seen in this figure from the CRA:
Software development is rife with references to business value, particularly in agile approaches: the Agile Manifesto declares that “Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.”
In case this helps other people:
This blog post from the excellent complexity blog Gödel’s Lost Letter is on the theory behind branch-and-bound search. One of my favourite things about this sort of analysis is how it can eliminate, with mathematical certainty, hours and hours of programming effort. Consider this statement:
This past semester (Winter 2012), I was the instructor for UBC’s CPSC 310: Introduction to Software Engineering. As part of the course, students must complete a large-scale software project in teams of 4–5 in 2 months. This term, I allowed some teams to use GitHub to manage the project.
My dissertation is nearing approval (touch wood) and I have started a new position as a Post-doctoral Research Fellow and lecturer at UBC. I wanted to summarize my experiences in grad school as a reflective exercise. I often found I got down on myself during the process: it is an incredible challenge to acquire a research Ph.D. at one of the top-10 computer science schools in the world. I’m extremely proud of my past selves for persevering and allowing 2011 Neil to reap the reward, as it were. ‘Cause 2006–2008 Neils put up with a lot of sh*t.
Time for some contrariness. The current rage in the academic software research community is evidence-based practice. It’s in popular magazines, desirable in academic publications, and the subject of a new book.