Archive for the Category analysis

 
 

Software, structure, evolution?

The structure of systems and their evolution is an often overlooked aspect of a product or organization, but it can be terribly important even in the “malleable” field of software. A working paper by MacCormack et al. maps the structure of many software products and shows how the “core” evolves over time.

These issues are especially pertinent to the context of software, given that legacy code is rarely re-written, but instead forms a platform upon which new systems are built. With such an approach, today’s developers bear the consequences of design decisions made long ago. Unfortunately, the first designers of a system often have different objectives from those that follow, especially if the system is successful and therefore long lasting. While early designers may place a premium on speed and performance, later designers may value reliability and maintainability. Rarely can all these objectives be met by the same design.
Don’t forget that such decisions can last for the life of the product! For example, Windows still has some problems with non-English folder and file names, a legacy of the old MS-DOS days. For a further discussion in the realm of architecture, read Brand’s wonderful “How Buildings Learn”.

Business Scenarios for Cross-functional Communication

iSixSigma features an interesting article on communicating processes through Business Scenarios. As people are pattern-matching, narrative-absorbing animals, the tool is very effective:

If you start with the simplest scenario where nothing goes wrong (sunny-day) then you can rapidly walk the whole process. You can add complexity as you need to… This approach really opens up the discussion as you are talking to people in the language they relate to. You get to see the true degree of variation required of the process which allows for more robust solutions.

Let the data work it out itself

Chris Anderson’s article in Wired argues that vast amounts of data (on the order of petabytes) will render models superfluous. The rationale is that in very complex systems for which vast data can easily be collected, it is more efficient to let the data make the model than to devise it ourselves; or in the words of Google’s research director Peter Norvig: “All models are wrong, and increasingly you can succeed without them.”

Anderson believes we have passed a critical point where the computing and storage power we have is enough to do this:

At the petabyte scale, information is not a matter of simple three- and four-dimensional taxonomy and order but of dimensionally agnostic statistics. It calls for an entirely different approach, one that requires us to lose the tether of data as something that can be visualized in its totality. It forces us to view data mathematically first and establish a context for it later. For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn’t pretend to know anything about the culture and conventions of advertising — it just assumed that better data, with better analytical tools, would win the day. And Google was right.

The problem is that when the model’s causality is unknown (since we didn’t design it in the first place; the data did), we can never be sure when it will misfire: the black swan problem. While such models may work in domains like marketing or biology, where the cost of mistakes is low, they cannot be trusted in mission-critical functions (see quant funds and the subprime crisis).

From data to knowledge

Much has been written about how to present rich data so as to make it as comprehensible as possible. Beyond the general principles (e.g. Tufte), however, there is also an element of art that makes an excellent piece of work stand out, such as the NYT interactive graphic on Obama and Clinton voters.

Statistics and Lies

The Long Now Foundation’s very good lecture series has added an excellent talk (mp3) by Nassim Taleb (Black Swan, Fooled by Randomness): Taleb deals with the problem of deriving general rules from a set of observations (induction), and the errors we make when applying it, consciously or unconsciously.

What is all this good for? Anyone who uses the law of large numbers, takes averages, or analyzes data with the Pareto law must always keep their limitations in mind; otherwise a “black swan” can easily expose them…
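
To make the warning about averages concrete, here is a minimal sketch in Python. It is an illustration, not something taken from Taleb's talk: the function names, the tail index `alpha=1.1`, the sample size and the seed are all assumptions chosen for the example. It compares how much of a sample's total comes from its single largest observation, once for thin-tailed data and once for fat-tailed (Pareto) data.

```python
import random


def max_share(samples):
    """Fraction of the total sum contributed by the single largest observation."""
    return max(samples) / sum(samples)


def draw_thin_tailed(rng, n):
    # Absolute values of standard normal draws: a positive, thin-tailed sample.
    return [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]


def draw_fat_tailed(rng, n, alpha=1.1):
    # Pareto draws with tail index alpha (an assumed value).
    # The closer alpha is to 1, the fatter the tail.
    return [rng.paretovariate(alpha) for _ in range(n)]


rng = random.Random(42)  # arbitrary seed, only for reproducibility
n = 100_000              # arbitrary sample size

thin = draw_thin_tailed(rng, n)
fat = draw_fat_tailed(rng, n)

print("thin-tailed: share of total from largest draw =", max_share(thin))
print("fat-tailed:  share of total from largest draw =", max_share(fat))
```

In the thin-tailed sample no single draw matters, so the average settles quickly. In the fat-tailed sample one observation can account for a visible part of the total, which means the average can shift noticeably when the next extreme value arrives.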