In one of my previous blogs I mentioned Moneyball: The Art of Winning an Unfair Game by Michael Lewis as an inspiration for my job as a software analytics specialist. I got the book from one of my customers when I finished the implementation of a measurement approach and the setup of a performance dashboard for his software engineering projects. “You must read this. It’s a great example of how statistics are used in decision making”.

Of course he was right. Customers are always right.

Later on, in 2011, a film based on the book, starring Brad Pitt, was released. Brad Pitt plays Billy Beane, the general manager of the Oakland Athletics, a baseball team that adopted an analytical, evidence-based, sabermetric approach to assembling a competitive team despite Oakland’s worrying financial situation. What I liked most in the book is that Lewis suggests that where expert judgment is not always a good starting point for decision making, statistics might make the difference. A second thing I like is the idea that a team that is certainly not a big spender (the Oakland Athletics had the third-lowest team payroll in the league in 2002, about $40 million) can make big wins.

See the following excerpt from Moneyball and experience how Brad Pitt tries to understand “What’s the problem?” when talking to his team of experts.

The impact of Moneyball is huge.

As Wikipedia states “Moneyball has entered baseball's lexicon; teams that appear to value the concepts of sabermetrics are often said to be playing ‘Moneyball’. Baseball traditionalists, in particular some scouts and media members, decry the sabermetric revolution and have disparaged Moneyball for emphasizing concepts of sabermetrics over more traditional methods of player evaluation. Nevertheless, Moneyball changed the way many major league front offices do business. In its wake, teams such as the New York Mets, New York Yankees, San Diego Padres, St. Louis Cardinals, Boston Red Sox, Washington Nationals, Arizona Diamondbacks, Cleveland Indians, and the Toronto Blue Jays have hired full-time sabermetric analysts. When the New York Mets hired Sandy Alderson – Beane's predecessor and mentor with the A's – as their general manager after the 2010 season, and hired Beane's former associates Paul DePodesta and J.P. Ricciardi to the front office, the team was jokingly referred to as the ‘Moneyball Mets’.”

Statistically the best coach for Brentford

In a recent article in a Dutch newspaper, NRC Handelsblad, titled “Statistically the best coach for Brentford”, Marinus Dijkhuizen was interviewed.

“With difficulty he kept Excelsior (a rather low-ranked Dutch football club) in the top division, yet he landed a job as coach of a British football club, Brentford FC. How? Statistics. Virtually everything at the club is determined by statistics. Tomorrow the first league match in London awaits.” Apparently when big money is at stake, statistics can make a difference.

Football is about millions of Euros. But nowadays software engineering is too.

Okay. Football is about millions of Euros. But nowadays software engineering is about millions too. Companies in banking and telecommunications and large government agencies spend tens, or even hundreds of millions each year on software engineering activities. And portfolios are growing over time. So why are these software companies not making more use of statistics in their decision making?

Within the Software Engineering Research Group of Delft University of Technology we do research on how to quantify and qualify software engineering activities, in order to help software companies get the most value out of their efforts in building, enhancing and maintaining software. A subject we dived into recently is the pricing of software projects.

Pricing via Functional Size - A Case Study of a Company’s Portfolio of 77 Outsourced Projects

In a research paper “Pricing via Functional Size” we describe a case study of 77 outsourced software engineering projects performed at a medium-sized Western European telecom company. The organization had experienced a worsening performance trend, indicating that it did not learn from history, combined with much time and energy spent on the preparation and review of project proposals by the main Indian supplier responsible for the design, build and test activities in the projects.

In order to create more transparency in the supplier proposal process, a pilot was started on Functional Size Measurement pricing (FSM-pricing). The telecom company and its main Indian supplier joined forces to prepare project proposals that were priced (fixed price) based on functional size alone. The function point analysis was performed as part of the project proposal process: the Indian supplier performed the actual function point count and the telecom company reviewed the results.

In our research paper we evaluate the implementation of FSM-pricing in the software engineering domain of the company, as an instrument useful in the context of software management and supplier proposal pricing. We analyzed 77 finalized software engineering projects, covering 14 million euros in project cost and a portfolio size of more than 5,000 function points. We found that a statistical, evidence-based pricing approach for software engineering can, as a single instrument (without a connection to expert judgment), be used in the subject company to create cost transparency and enable performance management of software project portfolios.
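The core mechanism of FSM-pricing can be sketched in a few lines: a proposal’s fixed price follows directly from its counted functional size and an agreed rate per function point. The rate table and size bands below are purely hypothetical, for illustration only; the paper does not disclose the company’s actual figures.

```python
# Hypothetical FSM-pricing sketch: a fixed project price derived from
# functional size alone. All rates and size bands below are made up.

PRICE_PER_FP = {           # hypothetical rates in euros per function point
    "small": 900,          # up to 100 FP
    "medium": 800,         # 101 to 500 FP
    "large": 700,          # over 500 FP
}

def size_band(function_points: int) -> str:
    if function_points <= 100:
        return "small"
    if function_points <= 500:
        return "medium"
    return "large"

def fixed_price(function_points: int) -> int:
    """Fixed proposal price based on functional size alone."""
    return function_points * PRICE_PER_FP[size_band(function_points)]

print(fixed_price(120))  # 120 FP at the hypothetical 'medium' rate: 96000 euros
```

With the price per function point fixed up front, the review of a proposal centers on the function point count itself.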

A Summary of the Results of the Survey Analysis

This list summarizes the findings from our analysis of the survey results:

  • Interactions, communications, people
    • Improved proposal transparency
    • Improve knowledge of Function Point Analysis and FSM-pricing
    • Discussion on size when lower price is expected or on waivers
  • Organization, processes
    • Uniform, standard and simplified process
    • Too small projects; no focus on release-based working
    • Delay due to search for clarity and review
    • Improve pricing tables (e.g. benchmarking, more realistic figures)
    • Promote release-based working based on size
    • Promote pricing tables based on applications (technology)
  • Measurements
    • Perform gap-analysis on FSM-price versus actual effort spent
  • Requirements
    • FSM-pricing does not cover non-functional requirements
    • Low reliability of FSM-pricing when compared to actual effort
    • Improved Requirement Management
  • Artifacts
    • Good quality of Function Point Analysis process and products

Happy stakeholders and an operational approach

The telecom company used the pricing approach for more than a year, with happy stakeholders at both the telecom company and the Indian supplier that performed the projects. A survey among twenty-five stakeholders from both organizations revealed that overall people found the transparency of project proposals improved, and almost everyone wanted the approach to become an operational practice. We also found some things to improve. Developers from the Indian supplier told us that the approach, based on functional size alone, did not take non-functional requirements sufficiently into account. And management of the telecom company, for its part, asked for a higher coverage of projects in scope of the approach.

In a way we experienced a win-win here. The telecom company got a better grip on the decision process around software engineering costs, and the Indian supplier was incentivized to deliver the highest possible value (measured in function points) to its customer for a fixed amount of money per function point. But, as always, good things can end too.

Decision makers come and go. And the operational practices follow.

Decision makers come and go. And new decisions are made on how to proceed with the innovation of software engineering. Eventually the management team of the telecom company stopped the pricing approach and took the agile highway. And with that, decision making based on statistics changed.

Agile coaches don't like statistics. They see it as waste.

Where at first statistics were used for decision making on project prices without the intervention of expert judgment, now agile coaches advise the company where to go. And, as in many other software companies that go agile, the agile coaches don’t like statistics. They see it as waste. The best they came up with was badly measured velocity, in my eyes a sort of prehistoric variant of a productivity measure.

Okay, this is a blog so I can state blunt things.

At the risk that people might judge me as one of those elderly they-know-what’s-good-for-you experts depicted in Moneyball’s “What’s the problem?” scene shown above, I still regret that the management team of the telecom company did not opt for a smashing combination of the functional-size-based pricing approach and an agile approach to shortening time-to-market and improving process quality. That way the price of projects and the functionality to be delivered would be fixed, leaving all management attention free to shorten durations and improve quality, without the often-supposed trade-off effects between cost, time and quality.

Especially in today’s agile world we need decision models that do not rely solely on beautifully drawn but often somewhat naive frameworks, a deluge of agile gurus who know how everything in the whole world is connected, and the foolish idea that making proper designs and collecting data for statistical analysis is waste. Wow, that was a nice sentence to write down!

Software engineering is not always about big data. But it sure is about big money.

Simply stated: contemporary software engineering is not always about big data. But it sure is about big money. We need to start treating our decision making that way too. We need skilled software analytics experts. But maybe even more we need managers who think like Billy Beane.


In my November 2014 blog post Is there a relation between project size and value of a software project? I announced that within our Software Engineering Research Group of Delft University of Technology we were building a tool based on a so-called Cost/Duration Matrix. Now you can use the Software Project Benchmark tool yourself!

A spin-off between research and industry

The Software Project Benchmark tool is made available by my own company Goverdson, a Netherlands-based IT company specialized in evidence-based software portfolio management. The tool is a spin-off of close cooperation between Hennie Huijgens (owner of Goverdson) and Georgios Gousios (assistant professor at the Digital Security Group of Radboud University Nijmegen); the two met as members of the Software Engineering Research Group of Delft University of Technology.

The tool itself is built in R, a language and environment for statistical computing and graphics that’s been widely used in scientific environments. R provides a wide variety of statistical and graphical techniques, and is highly extensible. The R language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

A secure open source solution

The Software Project Benchmark tool is an open source solution. We are convinced that it is important for IT organizations to pay more attention to evidence-based portfolio management. Therefore our tool is available for free and anyone can use the research data we collected. The Software Project Benchmark is a safe tool to use: we do not store any user data, nor the data that is uploaded to the tool. The results of the benchmark are not viewed, used or stored by us. In return we ask the users of the tool to make us happy by letting us know what their experiences with the tool are and where they see room for improvement.

It's a beta version!

The Software Project Benchmark tool is a beta version of a hopefully more mature benchmark tool that we plan to publish in the future. Therefore the available functionality is limited: for example, we have no print function yet, and we apologize upfront for any unforeseen errors. But we thoroughly tested the benchmark functionality and the underlying calculations, and the tool is the result of long and thorough research.

The core of the tool is the Cost/Duration Matrix

The Cost/Duration Matrix is a figure that is constructed based on the combination of two linear regressions:

  1. A regression (a power fit, which is linear in log-log space) where all software projects from the research repository are plotted by project size (measured in function points) versus project duration (measured in months). This scatter plot indicates which projects score below the average trend line with regard to project duration, meaning the project duration was shorter than average, and which projects score above the average trend line, meaning the project duration was longer than average.
  2. A regression (power fit) where all projects from the research repository are plotted by project size (measured in function points) versus project cost (measured in euros). This scatter plot indicates which projects score below the average trend line with regard to project cost, meaning the project cost was lower than average, and which projects score above the average trend line, meaning the project cost was higher than average.

Based on these regressions, two performance indicators are calculated for each project in the research repository, as a measure of performance with regard to project cost (Productivity, measured in cost per function point) and project duration (Time-to-Market, measured in days per function point).

For each project the deviation from the average trend line is calculated and expressed as a percentage: negative when below the average trend line, positive when above it. Based on these percentages all projects from the repository are plotted in a so-called Cost/Duration Matrix, resulting in four quadrants. Each quadrant is characterized by negative or positive deviation from the average trend.
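The tool itself is written in R, but the calculation can be sketched for illustration; the snippet below is my own reading of the approach, not the tool’s actual code. A power fit value = a · size^b is a linear regression in log-log space, and a project’s deviation is how far its actual value lies from the fitted trend:

```python
import math

def power_fit(sizes, values):
    """Least-squares fit of values = a * sizes**b, linear in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)
    return a, b

def deviation_pct(size, actual, a, b):
    """Percentage above (+) or below (-) the average trend line."""
    expected = a * size ** b
    return 100.0 * (actual - expected) / expected

# Hypothetical repository entries: (size in function points, cost in euros)
projects = [(100, 90_000), (200, 150_000), (400, 260_000), (800, 500_000)]
a, b = power_fit([s for s, _ in projects], [c for _, c in projects])
for size, cost in projects:
    print(size, round(deviation_pct(size, cost, a, b), 1))
```

The same calculation, run once against duration and once against cost, yields the two percentages that place a project in the matrix.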

Four quadrants indicate project performance

In the figure above the four quadrants are indicated. They can be explained as follows:

  1. Good Practice (upper right): This quadrant shows projects that scored better than the average of the total repository for both cost and duration.
  2. Cost over Time (bottom right): In this quadrant projects are reported that scored better than the average of the total repository for cost, yet worse than average for duration.
  3. Bad Practice (bottom left): This quadrant holds projects that scored worse than the average of the total repository for both cost and duration.
  4. Time over Cost (upper left): In this quadrant projects are plotted that scored better than the average of the total repository for duration, yet worse than average for project cost.

As the figure shows the projects that are depicted in the Cost/Duration Matrix can have different colors. The color of a project is an indication of the measure of a third performance indicator that is calculated: the so-called Process Quality of a specific project.

No color (white) indicates that no defects (errors) were measured for a project. A red color indicates a worse than average Process Quality: relatively many defects were measured in comparison with other projects of the same size in the research dataset. A green color indicates a better than average Process Quality: relatively few defects were measured.
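Putting the quadrant labels and the colors together, the classification of a single project could look like the sketch below (the function names and conventions are my own illustration, not the tool’s code; negative deviation means better, i.e. cheaper or faster, than the trend):

```python
def quadrant(cost_dev_pct: float, duration_dev_pct: float) -> str:
    """Map the two deviation percentages onto the four quadrants."""
    if cost_dev_pct < 0 and duration_dev_pct < 0:
        return "Good Practice"          # better than average on both
    if cost_dev_pct < 0:
        return "Cost over Time"         # cheaper than average, but slower
    if duration_dev_pct < 0:
        return "Time over Cost"         # faster than average, but more expensive
    return "Bad Practice"               # worse than average on both

def color(defect_dev_pct) -> str:
    """Process Quality indicator relative to same-size projects."""
    if defect_dev_pct is None:
        return "white"                  # no defects measured for this project
    return "red" if defect_dev_pct > 0 else "green"

print(quadrant(-12.0, -8.0), color(-30.0))  # a Good Practice project, green
print(quadrant(15.0, 20.0), color(None))    # a Bad Practice project, white
```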

Project data in a research repository

An important feature of the Software Project Benchmark tool is the availability of a large research repository, including data of almost 400 finalized software projects, which is the source for the benchmark functionality.

Qualitative data is recorded for three research categories: the applicable business domain in which a software project was delivered (e.g. Mobile Apps, Payments, Data Warehouse), the primary programming language that was used within the software project (e.g. COBOL, .Net), and the delivery model that was applicable for the software project (e.g. waterfall, Scrum).

And this is what the tool looks like

The Software Project Benchmark tool allows organizations to compare the project cost, the duration of a project and the delivered quality of a subset of their own completed software projects with that of projects in the research dataset of Goverdson. The tool provides an overview of the performance of an organization, measured on a software project portfolio as a whole. This makes it an excellent tool for stakeholders in software portfolio management to gain insight into those aspects in which their own organization excels and also the potential for improvement in the IT portfolio.

The Software Project Benchmark tool offers end users functionality to build a template-based input file with project data collected and measured within their own company, to analyze and benchmark their data against the research dataset of Goverdson, to set peer-group definitions for comparison, and to adjust configuration settings.

A detailed overview of the tool, its background, the offered functionality and guidelines for interpreting the outcomes can be found in the User Manual of the Software Project Benchmark tool.

Future developments

At the moment we are busy developing new research ideas to build on top of the tool. To give you an idea of what we are working on: we are curious how to enrich our model with quantified information about value. In our current model we notice that software projects that end up in the so-called Bad Practice quadrant (scoring worse than average for both cost and duration) might represent a high amount of value for the company. Or the other way around: a project ending up as Good Practice might not deliver any value at all. We'll keep you posted on this in future blog posts!

Try out the tool!

But for now, start the Software Project Benchmark tool yourself! Just try it out. Nothing can go wrong; it's perfectly safe and you can fool around as long as you like.

But please tell us about your experiences! Did you like the tool? Did you hate it? Do you think measurement sucks? Can we improve things? We'd like to know!

Have fun!


I have to admit, I am a clumsy Twitter klutz. Yet lately I noticed a tweet that really triggered me. At the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, which took place from the 18th to the 20th of November in Hong Kong, Magne Jørgensen of Simula Research Laboratory in Norway gave an invited talk on ‘Ten Years with Evidence-Based Software Engineering. What Is It? Has It Had Any Impact? What’s Next?’

I was not there, but people who were tweeted enthusiastically about the question raised: “Is agile better than traditional methods?” The answer, given in the keynote, was that such a question is not answerable. Another tweeter told me about “Confirmation of confirmation bias: Magne Jørgensen shows that believers in agile see proof in random data”.

“Believers in agile see proof in random data”

In his talk Jørgensen described an experiment in which fifty developers from a Polish company, with a strong belief in agile, were asked to answer the question “are agile methods better?” based on a dataset with the triples Development method (agile or traditional), Productivity (FP per working day), and User satisfaction (dissatisfied, satisfied, very satisfied). What the fifty developers did not know was that all datasets were randomly generated.

See this presentation where the whole background is explained; in short, the developers showed a strong bias in favor of agile methods. Their agreement with the claim depended on their prior belief in agile methods. Due to a lack of objective measurement and more variables of importance, the real-life bias is expected to be much stronger than in the experiment.
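The setup of such an experiment is easy to reproduce. The sketch below generates a purely random dataset of the same shape (the ranges and sample size are my own guesses, not the experiment’s actual parameters); any difference between the two methods in it is noise by construction, yet it is exactly the kind of “evidence” a believer can latch onto:

```python
import random

random.seed(1)  # fixed seed so the example is reproducible

def random_dataset(n=30):
    """Random triples: (development method, productivity, user satisfaction)."""
    rows = []
    for _ in range(n):
        method = random.choice(["agile", "traditional"])
        productivity = round(random.uniform(5.0, 15.0), 1)  # FP per working day
        satisfaction = random.choice(
            ["dissatisfied", "satisfied", "very satisfied"])
        rows.append((method, productivity, satisfaction))
    return rows

data = random_dataset()
for method in ("agile", "traditional"):
    values = [p for m, p, _ in data if m == method]
    if values:
        mean = sum(values) / len(values)
        print(f"{method}: mean productivity {mean:.1f} FP/day "
              f"({len(values)} projects)")
# The two means will almost always differ somewhat, purely by chance.
```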

Evidence-based software engineering is important!

Magne Jørgensen explains what his keynote is about: “The keynote reflects on what it in practice means to be evidence-based in software engineering contexts, where the number of different contexts is high and the research-based evidence sparse, and why there is a need for more evidence-based practices”. He argues in his presentation that evidence-based software engineering is needed to replace biased beliefs and opinions with evidence, and to challenge existing practices. But at the same time he tells us that the main problem is that it is so difficult to convince software engineers of the importance of doing so. Something I recognize from my own practice too.

The main problem is that it is so difficult to convince practitioners of the importance of evidence-based software engineering

This all reminded me of a paper I started on some time ago, but that for some reason I have forgotten never reached publication. I intended to use the rise of modernism in the early 20th century as a metaphor for how innovative new approaches in developer communities could be seen in the light of the evolution of software engineering. The source for this idea was Modernism: The Lure of Heresy by Peter Gay, a both fascinating and instructive book that The New York Times reviewed as a “massive history of the movement in all its artistic forms — painting, sculpture, fiction, poetry, music, architecture, design, film (though, bafflingly, not photography, one of the chief catalysts of the modernist revolution)”.


Modernism is a blueprint for innovation for modern enterprises

Modernism led to a century and a half of renewal across all art movements from the end of the nineteenth century onwards. The new contemporary art that emerged influenced a large number of followers, causing many artists to change (to transform) in the direction the innovators had pointed them. Modernism is thus essentially a blueprint for innovation for modern enterprises.

It all started with the professional outsider

In the early days of modernism some artists consciously chose the role of 'professional outsider’, which changed the cult of art into the cult of the artist. It was a non-uniform group of individuals who at a later stage began to organize into so-called avant-gardes, a kind of collective individualism. Over time this led to an enormous diversity of art: impressionism, cubism, abstract expressionism, action painting, color field painting… Many modernist artists felt strongly attracted to heresy and to rejection of the middle class. In addition, they constantly subjected themselves to principled self-examination.

A constant balance between conflicting extremes

A common feature of all identified strengths of modernist artists is that they constantly seemed to balance between conflicting extremes. As Csikszentmihalyi describes in Flow: The Psychology of Optimal Experience, for example, creative people are often both physically energetic and modest, clever and at the same time naïve, or swinging between imagination and a strong sense of reality.

An excellent breeding ground for innovative art

The period around the end of the nineteenth century was an excellent breeding ground for the emergence of innovative art. Gradually a liberal state developed that imposed little strict censorship and oppression. The social and cultural conditions for modernism were thus present. Partly as a result of industrialization and urbanization, there was a growing middle class, whose quest for happiness and freedom resulted in a wave of democratization. The growing power of third parties (art critics, art dealers, and museum directors) led to a transformation of art from a world driven by aristocracy and government into a public one.

A model for innovation

It seems from the above that renewal processes follow a relatively fixed pattern. I drew up a figure that shows the development within an innovation process, based on the events that occurred in the growth of modernism:


A translation to agile

At the risk of being called a dreamer, I think it is an intriguing thought whether the model sketched above can also be applied to the way agile development evolves in industry and research today, since agile development was also a movement started by professional outsiders who organized themselves in loosely coupled avant-gardes. Think of the group of seventeen people who signed the agile manifesto, some of whom have almost reached guru status. Like the professional outsiders of the late nineteenth century they were driven by rebellion against a growing middle class, in our software engineering world represented by a traditional way of working, whatever the name ‘traditional’ exactly means. This mainly revolves around highlighting contradictions; look at the way the agile manifesto emphasizes contradictions while stating at the same time that there are no real contradictions:

Individuals and interactions over processes and tools

Working software over comprehensive documentation

Customer collaboration over contract negotiation

Responding to change over following a plan

Are we moving towards massive diversity as a next step?

In modernism the next phase after the forming of avant-gardes was a movement towards a huge number of different trends in art (also referred to as “isms”). Are we now moving towards such massive diversity in agile environments? That would not surprise me. Agile is already supposed to be a catch-all term that covers methods (or should I say frameworks, or approaches, or whatever?) such as Scrum, XP, and many others. And looking at Scrum, for example: I encounter many different implementations of Scrum in practice. A clear and standardized guideline on how to implement Scrum is not to be found, so every Scrum coach (and there are many) seems to propagate their own ideas, opinions, guidelines, workshops, and (in limited cases) facts.

So, I think we are entering a period of, on the one hand, extreme freedom for development teams and great performances measured in velocity, yet on the other hand an often tangled-up set of development approaches and related concepts like Scrum, Kanban, DevOps, Continuous Delivery, Continuous Integration, and whatever new terms pop up in the coming months. We should be prepared to meet entanglement and resistance, and think about how to handle this.

Therefore, as a help for people struggling too, here are some recommendations based on the evolution that occurred during modernism.

7 recommendations for companies that aim for innovation

As a software engineering community we can learn from this energetic and innovative period. From the development during modernism into a massive number of different trends in art I distill seven recommendations for companies that want to achieve economic growth and innovation by thinking and working like an artist:

  1. Encourage the role of the professional outsider.

  2. Cherish avant-gardes; groups of like-minded outsiders.

  3. Facilitate substantive quests along the borders of the middle class.

  4. Start the search for innovation when the spirit is right.

  5. Accept rejection and deliberately seek controversy.

  6. Take calculated risks and accept the consequences.

  7. Use powerful third parties as 'insurance' against risks.

Thinking over this inevitable link between modernism and agilism, I created a word cloud of all the future trends in agile art that will pop up in the coming period. Use it freely in any presentation or talk on new trends in software engineering. I guarantee instant success!



A question pops up every now and then since I started my PhD at Delft University of Technology: whether a relation can be found between the functional size of a software project and the delivered value of such a project. In former research I already experimented a lot with measuring functional size of software projects. For this purpose we used an ISO-standardized method for Functional Size Measurement; a bit of a pimped name for counting function points.

Based on functional size, supplemented with some core metrics such as project cost, project duration and the number of defects found during the delivery period of a project, we built a so-called cost/duration matrix. The matrix proved to be a great instrument for benchmarking purposes.

We are building an R-script that creates a Cost/Duration Matrix

In our research group we are currently building an R-script that creates a cost/duration matrix in which all projects in our measurement repository (we have about 400 validated software projects in scope) are plotted in one of four quadrants. The distance from the average line with regard to project cost (vertical) and project duration (horizontal) indicates the percentage of deviation from the mean, as depicted in a single regression plot with power fit.


A Cost/Duration Matrix build from four quadrants

The cost/duration matrix is built from four quadrants. The Good Practice quadrant (top right) holds projects that performed better than the average of our measurement repository as a whole for both project cost and project duration. Here you’ll typically find projects that were performed in an agile way, with fixed, experienced teams, a steady heartbeat, and release-based way of working mapped on a single application (see our research paper How to Build a Good Practice Software Project Portfolio for all details).

At the bottom left you’ll find the Bad Practice quadrant; showing software projects that performed worse than average for both project cost and project duration. These projects are typically characterized as rules- and regulations driven, migration projects, implementations of projects with lots of new technology, and security driven projects.

On the top left and the bottom right you’ll see the two remaining quadrants: Time over Cost and Cost over Time, respectively. These quadrants hold projects where, as the names show, apparently project cost was preferred by the stakeholders or project duration was looked upon as more important. However, since we did not focus our research on these two quadrants, it is a bit unclear why some projects end up in the Time over Cost quadrant and others in the Cost over Time quadrant.

The colour of the dots indicates the quality of a project

As a kind of add-on the plot gives every dot representing a single software project a colour that indicates the process quality: green dots stand for projects with fewer than average defects, red dots represent projects with more than average defects. For black dots no defect data was available in our repository. And what we already revealed in our former research is clearly visualized here: software projects in the Good Practice quadrant tend to have fewer defects than average and projects in the Bad Practice quadrant show more defects than average.

Apparently time, cost, and quality go hand in hand in software projects

I use the Cost/Duration Matrix a lot in practice with my customers. I like it especially for its capability to visualize the effects of internal and external benchmarking of software projects in a company’s project portfolio. The matrix gives a clear view of how a company’s project portfolio performance deviates from that of our measurement repository, which is filled with a variety of software projects performed within peer-group companies.


The question of value pops up every now and then

Yet, the question of value keeps popping up every now and then. Imagine a software project that ends up in our Bad Practice quadrant, but at the same time delivered lots of value to the company after it went live. Or, another example that I hear a lot: what about WhatsApp? An application that delivers only limited functionality to its users, and is therefore measured at a low number of function points. But the delivered value, measured in return on investment, is huge. How come Bad Practice?

WhatsApp in the Bad Practice quadrant? Don’t make me laugh!

That’s why I am experimenting a bit with all kinds of measures for value that I run into in practice. Today I did a small experiment with a so-called Strategic Score that is used within the project office of a software company to prioritize software projects in a project portfolio. The Strategic Score is built from a Group Score and, within a group, a Priority Ranking. I walked over to one of my favourite project officers and asked her to explain to me how this Strategic Score worked.

“The Group Score is measured from one to six. One stands for the highest-priority projects; the score for this group is set manually by the executive board of the company in a monthly meeting. Group Scores two to five are determined automatically, based on a list of project characterizations such as brands, challenger DNA, cost leadership, digital, hardware, partnerships, and price leadership. Group Score six represents projects with no score calculated at all.”

“Within each Group Score a Priority Ranking is set: a number that indicates the rank of a specific project within its group. The Priority Ranking of a specific project is calculated automatically, based on the values of indicators such as Benefit (possible values none, low, medium, high), Budget (possible values none, low, medium, high), and Time-to-Market (the number of months to the delivery date).”
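As I understood it, the score combines the group number with the within-group rank. Here is a hedged sketch of one way such a score could be computed; the ranking rule, the field names, and the group-plus-rank/100 encoding are my own assumptions for illustration, not the project office’s actual formula.

```python
# Illustrative sketch of a combined Strategic Score: a Group Score
# (1 = highest priority, 6 = unscored) plus a Priority Ranking
# within the group. All names and rules below are assumptions.

LEVELS = {"high": 3, "medium": 2, "low": 1, "none": 0}

def priority_key(project):
    # Rank by benefit (descending), then budget (ascending, cheaper
    # first), then time-to-market in months (ascending, sooner first).
    return (-LEVELS[project["benefit"]],
            LEVELS[project["budget"]],
            project["time_to_market"])

def strategic_scores(projects):
    # Encode the score as group + rank/100, e.g. 2.01 for the
    # top-ranked project in group 2.
    by_group = {}
    for p in projects:
        by_group.setdefault(p["group"], []).append(p)
    scores = {}
    for group, members in by_group.items():
        members.sort(key=priority_key)
        for rank, p in enumerate(members, start=1):
            scores[p["name"]] = group + rank / 100
    return scores

projects = [
    {"name": "A", "group": 2, "benefit": "high", "budget": "low", "time_to_market": 3},
    {"name": "B", "group": 2, "benefit": "low",  "budget": "low", "time_to_market": 6},
]
scores = strategic_scores(projects)  # e.g. {"A": 2.01, "B": 2.02}
```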

Are you still with me?

Although I was a bit lost after the explanation above, I decided to plot the Strategic Scores of the software projects in my repository against the functional size of these projects in function points. This is what came out:


Okay, I couldn’t make too much out of it. It does not indicate any significant correlation between project size and Strategic Score. Maybe the range of projects with a Strategic Score of about 2.10 forms a kind of group? Unfortunately the experts of the project office couldn’t help me out much. They seemed a bit puzzled by the way the Strategic Score was calculated anyhow. A bit frustrating, though, when you realize that the executive board stares at these numbers at least once every month in their strategic meetings…
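For readers who want to repeat this kind of sanity check on their own portfolio: a plain Pearson correlation between size and score is a reasonable first look. The data points below are made up for illustration; they are not my repository data.

```python
# Quick sanity check along the lines of the plot described above:
# is there a linear relation between project size (in function
# points) and Strategic Score? Sample data are illustrative only.
from math import sqrt

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

size  = [120, 85, 430, 60, 250, 310]          # function points
score = [2.10, 2.12, 2.08, 5.03, 2.11, 6.00]  # Strategic Scores

r = pearson(size, score)
# |r| close to 0 would suggest no linear relation between size and score.
```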

I need your help!

Yes, you can help me here! Please tell me what measures your company uses to indicate the delivered value of software projects. Just to be sure: I already know about Story Points, yet these cannot be compared across companies or even across different software delivery teams. But in case you have great experiences with any other nice value measures – or in case you experienced major failures with some – please let me know. I’ll keep you posted. And by the way, in a few weeks we will share our R script to create a Cost/Duration Matrix with you via GitHub, so you can try all this yourself in your own company.


Recently a committee of members of the Dutch parliament, assisted by a team of researchers, published the results of an inquiry into the money spent on IT solutions by the government in The Netherlands. As everyone expected, the outcome was not good. The Dutch central government does not have control over its IT projects; the government does not realize it, but IT is everywhere; the government does not fulfil its ambitions with regard to IT; and IT tenders contain perverse incentives, to name some of the remarks made.

As a solution the committee invented ten so-called Bureau IT (abbreviated BIT) review rules. For each IT project that’s estimated to cost more than 5 million euros, the BIT (a newly to be set up team of IT governance specialists) is going to assess the project plan. The BIT acts as a kind of lock: only when the traffic light is green can a project be started.

What exactly is a project here?

Besides the fact that I doubt whether setting up yet another bureaucratic assessment team will really help the Dutch get a grip on their IT projects, one of the first things that struck me was the somewhat absurd idea of only assessing projects that are estimated to cost more than 5 million euros. I simply can’t reconcile this with what I see in my daily practice as a software economics specialist. Maybe the underlying question is: “what exactly is a project here?”

For the research we do at Delft University of Technology on evidence-based (quantitative) project portfolio management, we collected a measurement repository that is currently filled with data on almost 400 IT projects from different organizations in the banking and telecom sectors in The Netherlands and Belgium. The total project cost in the dataset is over 270 million euros. The average cost of an IT project in our dataset is more than seven hundred thousand euros. Only six projects in the repository cost more than five million euros; together they account for 13.5% of total project cost.
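The scope argument boils down to a simple filter over project costs. A small sketch, with illustrative numbers rather than the actual repository data:

```python
# Sketch of the scope calculation: how many projects a 5-million-euro
# review threshold would cover, and which share of total cost they
# represent. The cost figures below are made up for illustration.

THRESHOLD = 5_000_000  # euros

def review_scope(costs):
    in_scope = [c for c in costs if c > THRESHOLD]
    return len(in_scope), sum(in_scope) / sum(costs)

# Example: one large project among many average-sized ones.
costs = [700_000] * 99 + [7_000_000]
n, share = review_scope(costs)
# n == 1; only the single large project falls within the review scope,
# even though it carries a disproportionate share of total cost.
```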

Only six out of 400 projects fall within the scope of the committee

Translated to the bold government plans, this would mean that only a minimal top layer of all IT projects falls within the scope of the BIT assessment. Apart from that (yes, it’s already being mentioned by a lot of others too), a limit of 5 million euros creates a perverse incentive for stakeholders to go for projects budgeted just below that limit. All this gives me the impression that the committee does not really know how IT projects are carried out nowadays.

I would advise the committee - and decision-makers on IT projects in central government - to look a bit more abroad. With this approach the Dutch are positioning themselves as an old-fashioned-thinking IT grandfather: “Okay, when we first get our things in order, then the problem will solve itself.” Look instead at how the big banks, the telecom companies, and especially international IT innovators like Facebook and Google show that purely process-oriented thinking is not going to work. Oh, and by the way, make real work of educating IT experts in the economics of IT.



Lately I was writing a draft version of a research paper on software engineering, about how a telecom company applied a strictly statistical approach to the pricing of all project proposals that were submitted by their main Indian IT supplier. The basic idea of the approach was that no expert judgment was allowed, because we had the idea that expert judgment would hinder continuous improvement and cause non-transparency in the proposal process. I was hired by the company to help them improve the productivity of their software projects. This is how I started the research paper:


"The central premise of Michael Lewis’ bestseller Moneyball: The Art of Winning an Unfair Game is that the Oakland Athletics baseball team, led by its general manager Billy Beane, being in a disadvantaged revenue situation, made a glorious revival once an analytical, evidence-based, sabermetric approach was chosen to assemble a competitive baseball team. The collected wisdom of baseball insiders over the past decades proved to be subjective and often flawed. Moneyball begins with an innocent question: “how did one of the poorest teams in baseball, the Oakland Athletics, win so many games?” Lewis’ answer starts with an obvious point: “in professional baseball it still matters less how much money you have than how well you spend it”."

And then something curious happened

I asked some of my colleagues of the Software Engineering Research Group at Delft University of Technology to review the draft paper, and most of them were intrigued by the somewhat unusual pricing approach that was chosen and by the link I made to the use of sabermetrics in baseball. (The term sabermetrics refers to the empirical analysis of baseball, especially baseball statistics that measure in-game activity; it is derived from the acronym SABR, standing for the Society for American Baseball Research.)

Yet during the talks we had on possible improvements to the paper, one of my co-authors raised the question whether this link, and the prominent place at the beginning of the Introduction section I’d given it, would help once the paper was submitted to the International Conference on Software Engineering (ICSE). Being an insider expert in ICSE acceptance, he had the strong feeling that the link with Moneyball could easily be seen as simply too much of a curiosity for serious software engineering researchers. For those readers unfamiliar with ICSE: the conference can be seen as the flagship conference for software engineering researchers. Everybody who takes their research job seriously wants to publish here; it is not surprising that the acceptance rate of the conference is not one of the highest.

Apparently boredom and risk mitigation go hand in hand

So I took the advice seriously, and we agreed to remove all references to Moneyball from the paper. I was in a bit of an agony of doubt about it, because the paper became somewhat more boring too. Apparently boredom and risk mitigation go together. To cut a long story short, we finalized the research paper and submitted it to the conference. I’m now waiting for the results of the jury. “Armenia, may we have your votes please?”

Billy Beane can’t have enough of soccer after revolutionizing baseball

Yet the story continues. Yesterday I received a forwarded email that my co-author got from a Canadian professor in software engineering. Referring to a recent interview with Billy Beane by Sean Ingle in The Guardian, he came up with the remark that “ever since reading about him I have wondered how we can apply his principles to software development”.

As Beane puts it: “Our aim is to properly allocate credit and blame to a player. In baseball you can do something poorly and still get credit. A pitcher could throw a bad ball, the batter hit a screaming line drive, and an outfielder make a fantastic diving catch. Yet when you look at historical databases, 80% of the time when a ball is struck with that trajectory and velocity it is a hit. So because a superior defender caught it on that play, you should probably credit the hitter in some way and take away from the pitcher. Traditional stats don’t do that. They only credit outcome. They don’t credit process.” The Canadian professor ends by saying “I believe that software development is a team sport, but we don’t have much research on what the different types of players are. And even less in quantifying their production. So one of my dreams is to see our field develop sabermetrics.”

I even found an earlier interview with Billy Beane by Simon Burton, also in The Guardian, that reveals an intriguing story about the guy who helped Beane with his statistics: “Beane’s revolution at the A’s was assisted by Paul DePodesta, a tall, thin Harvard-educated number-cruncher (who refused to lend his name to the film, and thus appears as Peter Brand, a short, overweight Yale-educated number-cruncher) who is now vice-president of player development and amateur scouting at the New York Mets.” A fine example of how statisticians are put down as total nerds, I would say.

“So one of my dreams is to see our field develop sabermetrics”

Yes! We share the same dream here. Let me sketch a picture of mine. All software companies, big or small, collect a standardized set of metrics on all the software projects they perform and store these data in a measurement repository that, after analyzing specific trends, is used for estimating newly started software engineering activities, whether these are performed as waterfall projects, Scrum sprints, or release-based software enhancements. And since they all use a standardized set of measurements – yes, the sabermetrics of software engineering – external benchmarking finally becomes more than a wet dream for measurement geeks.

So, let’s be frank about this. If my research paper is not accepted by the ICSE reviewers, I can always blame it on the fact that the software engineering research community itself created a world where people think it’s stupid to refer to great, but maybe unusual, ideas that popped up in other worlds. Aargh! Okay, maybe it’s just a lousy paper… And if it is accepted, I can always shout it out loud when I present my story at the conference itself. And be sure I’m going to show them all the trailer of the movie that’s been made of Moneyball, and point out the revealing gesture that Billy Beane – played by Brad Pitt – makes when he is bored with the ongoing judgments of a room full of hearing-aid-wearing experts.


In 2013 I published a book in Dutch, named Agile Works: measure, analyze and benchmark IT projects. Some things didn’t make it into the book. Because they did not matter, or because they did not contribute directly to the message of the book, or simply because they did not fit within the number of pages (the publisher had indicated beforehand how many words the manuscript could contain…). Now, what intrigues me: I published the parts that were left out on the Dutch part of my website, and guess what? In an instant this was the most-read article of all. That’s why I don’t want to withhold this story about a great Dutch painter from the rest of the world.

The reality of Karel Appel

In 1961 the Dutch filmmaker Jan Vrijman made the documentary The Reality of Karel Appel, which for many years confirmed the stereotypical image of Appel as a crude beast of a painter. Under the heading ‘barbarian art in a barbaric time’, the rousing and evocative film carefully constructs the image of a common man’s boy, living and working in an attic above the city, claiming the right to make real, earthy, and probably misunderstood art. We see a raging Appel, screaming and grunting, obviously wrapped up in a rugged battle with a violent canvas. He smashes the paint onto the canvas, attacks it with a palette knife, and emphasizes the naivety of his way of working by regularly painting with his left hand (Appel is right-handed). He finishes the paint fight with the battle cry “I don’t paint! I hit!”, and after that he triumphantly leaves the arena to drink a cup of tea.

More than twenty-five years later, in his documentary Cobra, a revolt against the order, Vrijman examines what is left of the ideals of the Cobra movement. He encounters a completely different Karel Appel. Although still combative, common, and outspoken, we see a painter who thoughtfully builds up his huge canvases. Appel still squirts the paint directly from the tube onto the canvas. It’s a delight to see him rubbing and scratching. But the brutal confrontation of Appel looking directly at the viewer - in the first documentary Vrijman filmed Appel through a hole in the canvas - is missing. Here we see a painter who shows what really matters to him. He wants to be alert while the world sleeps: “…Being awake means being clear… that you clearly see life again, any time of day… To be awake means being alert. Your intelligence, your instinct, your radar, you should always be open to radar, ready to receive everything.”

Would that also be true for the world of IT? That you can make beautiful things by working from your gut and your intuition? Apparently Karel Appel found this approach certainly valuable and even essential, but it’s not the only thing that counts. I believe this is true too: it’s good to continuously measure and analyze what you’re doing. Wakefulness is important. And that’s basically why I think agile often works so well. Agile delivery approaches combine intuition with reason. Agile companies deal with uncertainties and risks in a smart way.

Make sure your radar is open

“Okay, a likely story,” you might say. But isn’t that a bit too easy? How real is that reality of Karel Appel, really? Isn’t much of what we see false, based on many spurious assumptions? And how much of that reality is founded on reason and facts, on real observations? Karel Appel manipulated reality in the first documentary in a clever and visually convenient way: he had the cameraman film him through a hole he had made in the middle of the canvas, so it looks like he is attacking not the canvas with paint, but the viewer. Does the documentary give a real picture of the life and work of an artist, or does Appel provoke the viewer by consciously building up the cliché of the tormented artist?

Yet I assume that for Karel Appel it is ultimately the painting itself that counts. That’s what you see; that’s what you perceive. And it is important that subjectivity and objectivity are looked at separately. Those who base their opinions only on quick assumptions will probably say “my little sister can do that!” Yet an objective viewer will realize that Appel looked at how children painted and used that as an example of innocence and naivety in his paintings. Smart observers might notice that some paintings by Appel hang upside down in museums. But whoever looks objectively will probably find out that some paintings actually hang upside down because the paint inside never fully dries and slowly flows down.

'An Appel is just a tube of paint that's not squeezed out completely'

Alright, let’s think broad-mindedly and translate this to the world of IT. As a specialist in the field of measurement, analysis, and benchmarking of IT projects, I often see how bias and subjective opinions rule decision making in companies. But besides that, I always try to take an objective look at the results of the projects. And I trust in the power, skills, and knowledge of the real professionals: the men and women who build and enhance the information systems.

Twenty-five years ago, when I made the transition from art to information technology, I felt I had landed in a completely new, different world where nothing was the same as in the world of art I knew so well. I stepped out of a world of non-reason and “you don’t need a plan, just start! There’s no predefined end result…” into a new, unknown universe in which all laws, rules, and regulations seemed to be different. Yet if you look at it from a distance, it’s good to finally notice that even in these different worlds many things are the same. Would that be a learning cycle too?
