And it makes me think, to what degree are organizations’ code bases shaped by their formal or informal organization structure? Are core modules and root objects often the domain of senior developers and objects lower in the hierarchy the domain of juniors? In my experience this is often the case, and it seems a reasonable overlap: after all, you want your more trusted developers fiddling where the damage can be greatest.
But how about other attributes of the code base? In the world of Perl, are CPAN authors often hired as external consultants? Are the most communicative programmers the ones that will write network services? Are the modules most used also written by the programmers that are most in contact with others?
And does organization structure also shape general code base structure? Will a more hierarchical organization tend more towards hierarchical object structures, while more chaotic or flat organizations tend towards more chaotic or flat code organization?
A lot of questions, but no answers… But one thing that comes to mind is, if organization and code structure follow each other, is this a good idea? I think few people designing a data model or object hierarchy start with the organization structure as a blueprint, but speaking from my own limited experience, you can often at least see a reflection of each in the other. Is this a good or bad thing? What can the consequences be?
Can a programmer on your team be an overall negative asset for your project? G. Gordon Schulmeyer argues that this is the case, with the NNPP, or the Net Negative Producing Programmer. His point is that a substantial number of programmers will introduce so many flaws into your code that the overall cost is higher than the value of their positive contribution.
I believe his theory has some blemishes, such as the claim that for every ten-programmer team, there is statistically a nil chance of there not being an NNPP on the team. In addition to assuming a normal distribution, this assumes that your team is a representative sample of the world distribution of programming skills and productivity. I don’t think I’m going too far if I suggest that this is rather doubtful for most teams.
Overall, however, the article has some good advice and insight. For example, Schulmeyer argues that the main cause of any programmer having negative production is inadequate management, and he tries to explain how to remedy this. Also, there are quite a few good quotes, such as this one:
John Gardner, in No Easy Victories, said, “An excellent plumber is infinitely more admirable than an incompetent philosopher.” He went on to observe that if society scorns excellence in plumbing because it is a humble activity and accepts shoddiness in philosophy because it is exalted, “neither its pipes nor its theories will hold water.”
Something in the techie DNA results in more weirdness than mere mortals (non-techies). Perhaps this quirkiness is because a certain type of personality is drawn to the techie world. Or maybe we’re somehow transformed over time by our darkened working environments and exposure to computer screen radiation
Personally, I’ve met a few weirdos, but I think weirdness and skill are generally negatively correlated. Maybe some people can just get away with it more easily in the world of programming.
I’ve been wanting to post a link to this article for a while, but ever since I discovered it, research.microsoft.com has been unreachable for me, so I’ll post a small summary:
Microsoft has done research on some popular conceptions about software engineering and come up with hard numbers on some factors affecting code quality. Here are the main findings reported in the article, with links to the research papers, in case the original is lost forever:
More test coverage does not equal better code quality, as measured by number of post-release fixes. Usage patterns and code complexity are the main reasons test coverage is a poor predictor of quality.
Organizational metrics, which are not related to the code, can predict software failure-proneness with a precision and recall of 85 percent. Not only that, but organizational structure was by far the best predictor of code quality, and was at least 8 percent better than the best predictor the researchers could get from code-based measurements. (The influence of organizational structure on software quality: an empirical case study)
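For readers who don’t have the two measures at hand: precision is the share of modules flagged as failure-prone that really failed, and recall is the share of failing modules that were flagged. A minimal sketch in Perl follows; the counts are hypothetical, picked only so that both come out at 85 percent, and are not taken from the paper.

```perl
use strict;
use warnings;

# Hypothetical counts for a failure-proneness predictor (illustration only,
# not data from the Microsoft study).
my %count = (
    true_positive  => 85,    # flagged as failure-prone, and did fail
    false_positive => 15,    # flagged as failure-prone, but did not fail
    false_negative => 15,    # failed, but was not flagged
);

# Precision: of everything flagged, how much was right?
my $precision = $count{true_positive}
              / ( $count{true_positive} + $count{false_positive} );

# Recall: of everything that failed, how much was caught?
my $recall = $count{true_positive}
           / ( $count{true_positive} + $count{false_negative} );

printf "precision=%.2f recall=%.2f\n", $precision, $recall;
```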
One drawback with this research is that it is primarily based on case studies, which is a generally poor research method for drawing general conclusions. How valid are these observations for other organizations outside Microsoft? Is the organizational structure of your project or company actually more decisive than your programming methodology?
Also, how transferable is this to other programming frameworks? In dynamically typed languages like Perl, is test coverage more important? I often find that a subset of my tests do what a compiler could have done in a statically typed language, or even for Perl if I just had a more automatic testing tool. So maybe coverage would be more predictive of bugs if the compiler catches fewer mistakes? That would be a good candidate for further research.
This is not on their webpage yet, but the PPIG is organizing a workshop at the School of Computing at the University of Dundee, Scotland. I don’t know if it has been announced yet, but a call for papers has gone out with deadline November 16th, and the workshop is scheduled for January 7-8.
How do programmers differ, and why should you care? Steven Clarke from Microsoft’s usability labs has identified and demonstrated at least three different programmer styles, which have been reported in quite a few places, so programmers do indeed differ. The types Clarke found are:
THE SYSTEMATIC DEVELOPER: Writes code defensively. Does everything he or she can to protect code from unstable and untrustworthy processes running in parallel with their code. Develops a deep understanding of a technology before using it. Prides himself or herself on building elegant solutions.
THE PRAGMATIC DEVELOPER: Writes code methodically. Develops sufficient understanding of a technology to enable competent use of it. Prides himself or herself on building robust applications.
THE OPPORTUNISTIC DEVELOPER: Writes code in an exploratory fashion. Develops a sufficient understanding of a technology to understand how it can solve a business problem. Prides himself or herself on solving business problems.
Now why should you care?
Almost every mention I’ve seen of this online, or of any other personality type categorization system, is followed by a “which type are you?”. This misses the point utterly and completely. Psychological research like this first becomes really valuable when you stop thinking about yourself and start asking how it can help you understand other people. If you design APIs and base your design on what makes most sense to your own coding style, you will create something that two thirds of your audience will find difficult to use. You need to design for them even if you don’t like or agree with their style.
Granted, that makes the assumption that programmers are always equally distributed among styles, which is a pretty wild assumption. The point is that other people are more likely to think differently than similarly to you.
That is also a good thing to keep in mind when formatting code for readability: if your coding style differs from standard Perl Tidy or your company’s coding standard, keep in mind that you are not formatting for yourself, but for a colleague, maintainer or anonymous CPAN downloader. They are more likely to understand a common standard than your standard. It sounds obvious, doesn’t it? Even so, I don’t think many (any) programmers think like this.
Now, Clarke, in an article for Dr. Dobb’s Journal, has an example of a cognitive mapping of programmer types and API traits which is quite illustrative. In Figure 1, thick blue lines show the expectations of a particular programmer type, while the dark lines show the score of a particular API. As you can see in this case, the match is bad. Now the good thing is that Clarke’s research gives you a framework to discuss how and why.
After writing about variable roles a while back, I’ve been thinking a lot about the way I use variables. The post also got quite a lot of attention, so it seems to have struck a chord with people. And no wonder it does: the concept kinda promises to tell you something profound about programming. The problem is that it doesn’t really deliver on the promise.
The first thing anyone with some programming experience will notice is that the majority of the roles are loop control variables (check the post, but basically they mention “stepper”, “most-recent-holder”, “most-wanted-holder”, “gatherer” and “follower” as distinct roles, as well as “fixed-value” and “temporary”, which could also be said to be typical loop control roles).
My first idea was that if all these are necessary to traverse loops, perhaps a historized scalar variable would be helpful. I.e. the “follower” could just be a ->prev() method, there could be a sum of the history and so on. And it turns out there is a variable history module: Tie::History (just remember to turn on AutoCommit).
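To make the idea concrete, here is a minimal sketch of a history-keeping scalar in plain Perl. It deliberately does not use Tie::History; the package name HistoryScalar and all of its methods are hypothetical, made up for this illustration.

```perl
use strict;
use warnings;
use List::Util ();

{
    # Hypothetical class: a scalar that remembers every value assigned to it.
    package HistoryScalar;

    sub new { my ($class) = @_; return bless { values => [] }, $class }

    # Store a new value; older values stay in the history.
    sub set { my ( $self, $value ) = @_; push @{ $self->{values} }, $value; return $value }

    # The current (most recent) value.
    sub get { my ($self) = @_; return $self->{values}[-1] }

    # The "follower" role: the value held just before the current one.
    sub prev {
        my ($self) = @_;
        my $values = $self->{values};
        return @$values > 1 ? $values->[-2] : undef;
    }

    # The "gatherer" role: an accumulation over everything seen so far.
    sub sum { my ($self) = @_; return List::Util::sum( 0, @{ $self->{values} } ) }

    # The full history, oldest first.
    sub history { my ($self) = @_; return @{ $self->{values} } }
}

package main;

my $x = HistoryScalar->new;
$x->set($_) for 3, 5, 8;
printf "current=%d prev=%d sum=%d\n", $x->get, $x->prev, $x->sum;
# prints: current=8 prev=5 sum=16
```

Note that sum and history are already just list operations over the stored values, which is where this line of thought ends up anyway.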
Now maybe that wasn’t such a great idea. Or maybe it just isn’t properly thought through. Because what happens then is that you just start doing list operations on the history. Hence, maybe the list operations are just the way to go, and a more functional programming style will get rid of the elaborate loop control variables. Or so I suggested in the post.
Here, I’m trying out some attempts at bypassing the loop control variable roles by using a more declarative/functional programming style. I’ve made some examples of typical imperative style loops, and have tried to see if I could get rid of the traditional variable roles by doing it in a declarative style. Have a look and see if it seems meaningful.
(Also, it gives me an opportunity to try out the awesome CodeColorer code viewer).
my @countries = qw( USA China Netherlands
                    Norway Finland Sweden );
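As a rough sketch of what a typical imperative loop over this list looks like, assume the goal is to pick the longest country name. That goal is an assumption for illustration, and so is the variable name $longest.

```perl
use strict;
use warnings;

my @countries = qw( USA China Netherlands
                    Norway Finland Sweden );

# "Most-wanted-holder": keeps the best candidate seen so far.
my $longest = '';
for my $country (@countries) {
    # $country holds each list item in turn while the loop walks the list.
    $longest = $country if length($country) > length($longest);
}

print "$longest\n";    # prints "Netherlands"
```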
In my alternative take, I realized I ended up with both sort of a “most-wanted-holder” and TWO temporary variables. I can’t help wanting to store my value somewhere, though, and at least no iteration is using it. The temporary variables are worse, however. The point was to get rid of them after all.
my @countries = qw( USA China Netherlands
                    Norway Finland Sweden );
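A declarative take on the same kind of task could be sketched with reduce from List::Util. Again this assumes the goal is the longest name, which is an assumption for illustration. The result still has to live somewhere, and reduce’s $a and $b are the two temporaries.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @countries = qw( USA China Netherlands
                    Norway Finland Sweden );

# reduce hands the block two temporaries: $a (the best so far) and $b (the next item).
# On a tie the earlier name wins, since $a is kept.
my $longest = reduce { length($a) >= length($b) ? $a : $b } @countries;

print "$longest\n";    # prints "Netherlands"
```

There is no explicit loop and no stepper left, only the comparison that says what “most wanted” means.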
One construct I vividly recognize is the “follower”, described as “A data item that gets its new value always from the old value of some other data item”. I am probably not the only one having written code balancing a pair of previous/this type variables. Here is a slightly contrived example that reports every adjacent pair of the same length:
my @countries = qw( USA China Netherlands
                    Norway Sweden France Finland );
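An imperative loop for this, with a previous/this pair, might look like the following sketch. The variable names $previous and @pairs are assumptions for illustration.

```perl
use strict;
use warnings;

my @countries = qw( USA China Netherlands
                    Norway Sweden France Finland );

my @pairs;
my $previous;    # the "follower": always gets the old value of $country
for my $country (@countries) {
    # Skip the comparison on the first item, where there is no previous value yet.
    if ( defined $previous && length($previous) == length($country) ) {
        push @pairs, "$previous $country";
    }
    $previous = $country;
}

print "$_\n" for @pairs;
# prints:
# Norway Sweden
# Sweden France
```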
This did not turn out to be as easy as I expected. If you try to do functional operations in Perl using core libraries, reduce is basically the only function that lets you operate on list items in a non-independent fashion. And the only way I found to do this with reduce is even more contrived than my imperative example:
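One way to press reduce into service for adjacent pairs is sketched below. It is an illustration and not necessarily the exact version meant above. The trick is to ignore the accumulated value and always return $b, so that it shows up as $a, the previous item, on the next call.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

my @countries = qw( USA China Netherlands
                    Norway Sweden France Finland );

my @pairs;
reduce {
    # $a is the previous item, $b the current one.
    push @pairs, "$a $b" if length($a) == length($b);
    $b;    # returned value becomes $a in the next call
} @countries;

print "$_\n" for @pairs;
# prints:
# Norway Sweden
# Sweden France
```

The follower has not gone away. It is hidden in the return value, and the block works through a side effect on @pairs, which is hardly in the spirit of a declarative style.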
Now, except for the last example, I find the declarative approach to be generally far easier to read. While that could just be because the examples are shorter, I don’t think that is sufficient as an explanation; there is plenty of compact code that would be easier to understand if it were more explicitly stated.
Now, cognitive psychology actually has an explanation of why declarative list operations like this are easier to understand. And I’ll present the evidence for that in my next post.
Computer programming perhaps more than any other manufacturing endeavor begins with a thought and through skilled application of knowledge yields an intrinsically proven object that is itself almost mental (encoded electrical information).
It’s a good argument for why cognitive psychology is relevant for computer programming, but even more important, it points out the almost mental nature of computer programs.
Physiologically, the way our brains operate is mainly through bursts of electrical current called action potentials, which propagate information down neurons in an on-off fashion. Some people will call it binary, but the information conveyed is typically frequencies rather than binary patterns (though this is a somewhat contentious question).
Here’s a beautiful drawing of neural communication from Wikipedia:
So you have a mental construction in a human brain contained in electrical signals, and this is transferred over to a construction of electrical signals in a computer. Add that you usually want these two representations to be identical, and you have a good argument for why programming language design should be based strongly on cognitive science!