Tuesday 15 December 2015

Build intuition around quality metrics

When you hear that a car drives at 250 km/h or that a tree is 2 meters tall, you can intuitively classify it as fast or slow, short or tall.

When you hear that the cyclomatic complexity of your code is 2.7 or that LCOM4 is 1.6, can you do the same? For many developers the answer is unfortunately no, because in general there is no such real-life intuition about programming. Many times I have seen people surprised to find four nested ifs in their code, as if they had appeared from nowhere.

We are not born with built-in knowledge about the code around us. Here is the first article returned by Google, which claims that we are actually born with some intuition about physics: Babies are born with 'intuitive physics' knowledge, says researcher. Some knowledge about the world around us has been useful for many generations. Computers, on the other hand, are a relatively new thing in the history of the human species.

So the idea of this exercise is to observe how complex code is created step by step, and to learn how some approaches help keep the value of this metric small. The code is used only for demonstration, so some bugs may appear in it.

Cyclomatic Complexity A.K.A Hadouken code

Wikipedia defines cyclomatic complexity as "a quantitative measure of the number of linearly independent paths through a program's source code", but a more intuitive reading for practitioners may be that with CC=2 you need to write 2 tests, with CC=3 you need 3 tests, and so on.

It sounds trivial, but we are about to learn that cyclomatic complexity has a non-linear nature, which means that if you couple two pieces of code with CC=2 you may obtain code with CC>4. Let's see this on the battlefield.

We are going to start with Java, because it has perhaps the best tools for measuring complexity, and then we will see what we can measure in Scala code.

Battlefield

Our experimental domain is very simple. Notice that at this stage we build just types with values. I used to call those DTOs in my old Java days. Now I'd call them records. Later we will see how the nature of those classes changes during the OOP refactoring.

class User{
    public final String name;
    public final Integer age;
    public final Gender gender;
    public final List<Product> products;

    public User(String name, Integer age, Gender gender, List<Product> products) {
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.products = products;
    }
}

enum Gender{MALE,FEMALE}

class Product{
    public final String name;
    public final Category category;

    public Product(String name, Category category) {
        this.name = name;
        this.category = category;
    }
}

enum Category{FITNESS,COMPUTER,ADULT}

Laboratory

We are going to develop and measure a piece of code which simulates the transformation of a business object into its text representation. That is more than enough for this experiment, so let's see the first version.

  public String complexMethod(User u){
        String value="";

        if(u.age > 18){
            value="Adult : " +u.name;
        }else{
            value="Child : " +u.name;
        }

        return value;
    }

We can easily measure the CC of this code, and to do this we are going to use SonarQube 4.5.6.

The "Metrics Reloaded" plugin will also be useful in some places.

So after the first measurement we receive CC=2: we have just one if statement, so we need two tests.

CC=4

Now let's add another conditional which executes different actions according to the gender property.

       if(u.age > 18){
            if(u.gender==Gender.MALE){
                value="Adult Male: " +u.name;
            }else{
                value="Adult Female: "+u.name;
            }
        }else{
            if(u.gender==Gender.MALE){
                value="Child Male: " +u.name;
            }else{
                value="Child Female: "+u.name;
            }
        }

Now let's try to guess the CC of this code. The following paths of execution are possible:

  1. IF -> IF
  2. IF -> ELSE
  3. ELSE -> IF
  4. ELSE -> ELSE

Sonar agrees.
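To make the four paths concrete, here is a minimal sketch with one input per path. The PathsDemo class name and the sample users are assumptions made for this illustration, and the User record is trimmed to the fields the method reads. The method body is the nested version shown above.

```java
enum Gender { MALE, FEMALE }

// Trimmed-down record: only the fields that complexMethod reads.
class User {
    final String name;
    final int age;
    final Gender gender;

    User(String name, int age, Gender gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }
}

class PathsDemo {
    static String complexMethod(User u) {
        String value;
        if (u.age > 18) {
            if (u.gender == Gender.MALE) {
                value = "Adult Male: " + u.name;     // path 1: IF -> IF
            } else {
                value = "Adult Female: " + u.name;   // path 2: IF -> ELSE
            }
        } else {
            if (u.gender == Gender.MALE) {
                value = "Child Male: " + u.name;     // path 3: ELSE -> IF
            } else {
                value = "Child Female: " + u.name;   // path 4: ELSE -> ELSE
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // One sample user per path, in the same order as the list above.
        User[] samples = {
            new User("Bob", 30, Gender.MALE),
            new User("Ann", 30, Gender.FEMALE),
            new User("Tim", 10, Gender.MALE),
            new User("Eve", 10, Gender.FEMALE)
        };
        for (User u : samples) {
            System.out.println(complexMethod(u));
        }
    }
}
```

Four paths means four inputs are needed to exercise every branch, which is exactly what CC=4 predicts.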

CC=5

Our logic is expanding. Now we also need to add information about products... but only for a user who is an adult male.

public String complexMethod(User u){
        String value="";

        if(u.age > 18){
            if(u.gender==Gender.MALE){
                value="Adult Male: " +u.name;

                for (Product p: u.products) {
                   value+="has product "+p.name+",";
                }
            }else{
                value="Adult Female: "+u.name;
            }
        }else{
            if(u.gender==Gender.MALE){
                value="Child Male: " +u.name;
            }else{
                value="Child Female: "+u.name;
            }
        }

        return value;
    }

Technically Sonar just counts the number of fors and ifs to calculate CC, but we can understand the result of CC=5 this way:

  • For empty collection nothing changes : CC+0
  • For non empty collection we have another possible path : CC+1
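The counting rule can be approximated in a few lines. The sketch below is a deliberately naive keyword counter, not Sonar's actual implementation: it starts at 1 and adds 1 for every if and for keyword found in a source string. The NaiveCc class and its regular expression are assumptions made for this illustration. A real tool parses the code, while this one would also count keywords inside comments and string literals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveCc {
    // Matches the keywords `if` and `for` as whole words only.
    private static final Pattern BRANCH = Pattern.compile("\\b(if|for)\\b");

    static int estimate(String source) {
        int cc = 1; // one path exists even without any branching
        Matcher m = BRANCH.matcher(source);
        while (m.find()) {
            cc++;   // every branching keyword adds one more path
        }
        return cc;
    }

    public static void main(String[] args) {
        // Skeleton of the method above: three ifs and one for.
        String source = String.join("\n",
            "if(u.age > 18){",
            "    if(u.gender==Gender.MALE){",
            "        for (Product p: u.products) { }",
            "    }else{ }",
            "}else{",
            "    if(u.gender==Gender.MALE){ }else{ }",
            "}");
        System.out.println("CC=" + estimate(source)); // prints "CC=5"
    }
}
```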

CC=6

Let's make things more interesting by adding a filtering condition to the products.

 public String complexMethod(User u){
        String value="";

        if(u.age > 18){
            if(u.gender==Gender.MALE){
                value="Adult Male: " +u.name;

                for (Product p: u.products) {
                    if(p.category!= Category.ADULT) {
/*HAAADUUUKEN  ~~~~@ */    value += "has product " + p.name + ",";
                    }
                }
            }else{
                value="Adult Female: "+u.name;
            }
        }else{
            if(u.gender==Gender.MALE){
                value="Child Male: " +u.name;
            }else{
                value="Child Female: "+u.name;
            }
        }

        return value;
    }

We have another single if in our code. Cyclomatic complexity is now CC=6, which may seem small, but it isn't. We already have the Arrow Code anti-pattern, and the context awareness of this piece of (sh...) code makes it almost impossible to reuse anywhere else.

Soon we will see how this way of coding raises complexity in a more exponential than linear way. But first let's look at something called essential complexity.

Essential Complexity

What if we don't want to initialize a mutable variable, but would like to return from within the nested ifs?

public String complexMethod(User u){
        if(u.age > 18){
            if(u.gender==Gender.MALE){
                String value="Adult Male: " +u.name;

                for (Product p: u.products) {
                    if(p.category!= Category.ADULT) {
/*HAAADUUUKEN  ~~~~@ */    value += "has product " + p.name + ",";
                    }
                }

                return value;
            }else{
                return "Adult Female: "+u.name;
            }
        }else{
            if(u.gender==Gender.MALE){
                return "Child Male: " +u.name;
            }else{
                return "Child Female: "+u.name;
            }
        }

    }

Now Sonar will return CC=10, because technically complexity in Sonar is not just cyclomatic complexity but CC + essential complexity (and maybe + something else). Here we will receive a better measurement with the metrics plugin.

So Complexity=CC+EC=6+4=10. Essential complexity, in my understanding, measures the number of places where the execution of your logic can end. Because we have multiple returns, the code is more difficult to analyze. Whether this statement is correct is a matter of discussion, but since I started learning functional programming and thinking in expressions, I believe I haven't used a construction like a return in the middle of the code. (BTW, I'm now going to change the syntax color so you can compare which is better.)
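One way to avoid returns in the middle of a method is to treat every branch as an expression that yields a value. The sketch below is an assumption about what that could look like in Java, not code from the measured project: it uses the conditional operator and a stream, with a hypothetical ExpressionStyle class and compact copies of the records. The branching is still there, but the method has a single exit point and produces the same strings as the version above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

enum Gender { MALE, FEMALE }
enum Category { FITNESS, COMPUTER, ADULT }

class Product {
    final String name;
    final Category category;

    Product(String name, Category category) {
        this.name = name;
        this.category = category;
    }
}

class User {
    final String name;
    final int age;
    final Gender gender;
    final List<Product> products;

    User(String name, int age, Gender gender, List<Product> products) {
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.products = products;
    }
}

class ExpressionStyle {
    // Hypothetical helper: renders every product outside the ADULT category.
    static String products(User u) {
        return u.products.stream()
            .filter(p -> p.category != Category.ADULT)
            .map(p -> "has product " + p.name + ",")
            .collect(Collectors.joining());
    }

    // Single exit point: the whole body is one expression.
    static String complexMethod(User u) {
        boolean adult = u.age > 18;
        boolean male = u.gender == Gender.MALE;
        return adult
            ? (male ? "Adult Male: " + u.name + products(u) : "Adult Female: " + u.name)
            : (male ? "Child Male: " + u.name : "Child Female: " + u.name);
    }

    public static void main(String[] args) {
        User bob = new User("Bob", 30, Gender.MALE, Arrays.asList(
            new Product("bike", Category.FITNESS),
            new Product("film", Category.ADULT)));
        System.out.println(complexMethod(bob));
    }
}
```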

OK, we see the problem. Now let's find a cure.

Refactoring OOP way

The easiest first step is to move the logic responsible for displaying products to a dedicated component. It's not easy to think about proper domain abstractions when we don't actually have any domain problem, but let's try with something generic.

interface ProductPolicy{
    boolean isCensored(Category category);
}

interface PolicyFactory{
    ProductPolicy create(Collection<Category> forbiddenCategories);
}

interface ProductDisplayer{
    String display(Collection<Product> products);
}

So there is a policy which may be configured through a factory. And we have an interface for our displayer, whose implementation may look like this:

class CensoredDisplayer implements ProductDisplayer{

    private ProductPolicy productPolicy;

    public CensoredDisplayer(ProductPolicy productPolicy) {
        this.productPolicy = productPolicy;
    }

    @Override
    public String display(Collection<Product> products) {
        String result="";
        for (Product p: products) {
            result+=addToDisplay(p);
        }
        return result;
    }

    private String addToDisplay(Product p){
        return productPolicy.isCensored(p.category)? "" : " has product "+p.name+",";
    }
}
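The post does not show implementations of ProductPolicy and PolicyFactory, so the following is only a sketch of how they might be wired together. SetBasedPolicyFactory and PolicyDemo are hypothetical names. CensoredDisplayer is repeated in compact form (with a StringBuilder) so the example compiles on its own.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

enum Category { FITNESS, COMPUTER, ADULT }

class Product {
    final String name;
    final Category category;

    Product(String name, Category category) {
        this.name = name;
        this.category = category;
    }
}

interface ProductPolicy { boolean isCensored(Category category); }
interface PolicyFactory { ProductPolicy create(Collection<Category> forbiddenCategories); }
interface ProductDisplayer { String display(Collection<Product> products); }

// Hypothetical factory: the created policy censors every forbidden category.
class SetBasedPolicyFactory implements PolicyFactory {
    @Override
    public ProductPolicy create(Collection<Category> forbiddenCategories) {
        Set<Category> forbidden = new HashSet<>(forbiddenCategories);
        return forbidden::contains;
    }
}

// Compact copy of the displayer above; it knows nothing about users or age.
class CensoredDisplayer implements ProductDisplayer {
    private final ProductPolicy productPolicy;

    CensoredDisplayer(ProductPolicy productPolicy) {
        this.productPolicy = productPolicy;
    }

    @Override
    public String display(Collection<Product> products) {
        StringBuilder result = new StringBuilder();
        for (Product p : products) {
            if (!productPolicy.isCensored(p.category)) {
                result.append(" has product ").append(p.name).append(",");
            }
        }
        return result.toString();
    }
}

class PolicyDemo {
    public static void main(String[] args) {
        ProductPolicy policy = new SetBasedPolicyFactory().create(Arrays.asList(Category.ADULT));
        ProductDisplayer displayer = new CensoredDisplayer(policy);
        List<Product> products = Arrays.asList(
            new Product("bike", Category.FITNESS),
            new Product("film", Category.ADULT));
        System.out.println(displayer.display(products)); // prints " has product bike,"
    }
}
```

Because the forbidden categories arrive as a parameter, the same displayer can be reused with a different policy without touching its code.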

Now let's take a look at our laboratory code.

 private ProductDisplayer productDisplayer;

    public String complexMethod(User u){
        String value="";

        if(u.age > 18){
            if(u.gender==Gender.MALE){
                value="Adult Male: " +u.name;
                value+= productDisplayer.display(u.products);
            }else{
                value="Adult Female: "+u.name;
            }
        }else{
            if(u.gender==Gender.MALE){
                value="Child Male: " +u.name;
            }else{
                value="Child Female: "+u.name;
            }
        }

        return value;
    }

The complexity of this code is CC=4 and the complexity of the displayer is CC=1.7, so technically the whole system is already a little less complex. (And "Beans" is the name of a class where I put all the interfaces.)

OOP data types

To move further, we can change the nature of the data types from records into richer entities and use inheritance polymorphism to dispatch execution between specific pieces of code.

Check the code.

abstract class User{
    protected final String name;
    protected Integer age;
    protected final List<Product> products;

    public User(String name, Integer age,  List<Product> products) {
        this.name = name;
        this.age = age;
        this.products = products;
    }

    abstract String introduceYourself();

    abstract List<Product> showProducts();
}

class MaleUser extends User{

    public MaleUser(String name, Integer age, List<Product> products) {
        super(name, age, products);
    }

    @Override
    String introduceYourself() {
        return (age>18?"ADULT MALE":"CHILD MALE") + " : "+name;
    }

    @Override
    List<Product> showProducts() {
        return age>18? products: new LinkedList<>();
    }

}

class FemaleUser extends User{

    public FemaleUser(String name, Integer age, List<Product> products) {
        super(name, age, products);
    }

    @Override
    String introduceYourself() {
        return (age>18?"ADULT FEMALE":"CHILD FEMALE")+ " : "+name;
    }

    @Override
    List<Product> showProducts() {
        return new LinkedList<>();
    }
}

What is important here is to notice how the context information has moved to the children of the (now abstract) class User. Context information is now encapsulated inside the classes, and this construction reduces cyclomatic complexity in the place where the objects are used. Look at this:

 public String complexMethod(User u){
        String result=u.introduceYourself();
        result+=productDisplayer.display(u.showProducts());
        return result;
 }

So once again: we take control over the context awareness of particular logic by moving this logic inside classes. Execution is controlled by polymorphic method dispatch.

Refactoring FP Way

We are going to start in a similar way as in the OOP example. So at the beginning let's move the logic responsible for filtering and displaying products into dedicated functions.

Function<Set<Category>,Function<Category,Boolean>> policy=
            forbiddenCategories -> category -> !forbiddenCategories.contains(category);

Function<Category,Boolean> adultPolicy=policy.apply(new HashSet<>(Arrays.asList(Category.ADULT)));
  

Function<Function<Category,Boolean>,Function<Collection<Product>,String>> displayProducts= policy-> ps ->
        ps.stream()
                .filter(p->policy.apply(p.category))
                .map(p->" has product "+p.name)
                .collect(Collectors.joining(","));

So we have a policy, a parametrized policy and a product displayer. The complexity of our lab method is now reduced to CC=4.

public String complexMethod(User u){
        String value="";

        if(u.age > 18){
            if(u.gender==Gender.MALE){
                value="Adult Male: " +u.name;

                value+=displayProducts.apply(adultPolicy).apply(u.products);
            }else{
                value="Adult Female: "+u.name;
            }
        }else{
            if(u.gender==Gender.MALE){
                value="Child Male: " +u.name;
            }else{
                value="Child Female: "+u.name;
            }
        }

        return value;
    }

Now let's introduce some conditional logic into the next functions, mainly to check how good Sonar is at measuring the CC of functions.

Function<User,String> ageLabel= u -> {
        if (u.age>18)
            return "ADULT";
        else
            return "CHILD";
    };

Function<User,String> introduce=u-> {
        String result="";
        switch(u.gender) {
            case MALE:
                result= ageLabel.apply(u) + " MALE" + " : " + u.name;
                break;
            case FEMALE: result= ageLabel.apply(u) + " FEMALE" + " : " + u.name;
        }
        return result;
    };
Function<User,Collection<Product>> getProducts = u-> {
        if(u.gender==Gender.MALE && u.age>18 )
            return u.products;
        else
            return new ArrayList<>();
    };

And finally let's see how all this functional machinery can help us!

//composition!!
Function<User, String> productDisplayer = getProducts.andThen(displayProducts.apply(adultPolicy));


public String complexMethod(User u){
     return introduce.apply(u) + productDisplayer.apply(u);
}
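For readers who want to run the composed version, the pieces above can be assembled into one file. This is a sketch under a few assumptions: the FpDemo class name and the sample data are invented for the illustration, the records are repeated in compact form, the introduce and getProducts functions are condensed into conditional expressions, and the lambda parameter is renamed to allowed so that it does not shadow the policy field.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

enum Gender { MALE, FEMALE }
enum Category { FITNESS, COMPUTER, ADULT }

class Product {
    final String name;
    final Category category;

    Product(String name, Category category) {
        this.name = name;
        this.category = category;
    }
}

class User {
    final String name;
    final int age;
    final Gender gender;
    final List<Product> products;

    User(String name, int age, Gender gender, List<Product> products) {
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.products = products;
    }
}

class FpDemo {
    // Curried policy: forbidden categories -> category -> "is this category allowed?"
    static final Function<Set<Category>, Function<Category, Boolean>> policy =
        forbidden -> category -> !forbidden.contains(category);

    static final Function<Category, Boolean> adultPolicy =
        policy.apply(new HashSet<>(Arrays.asList(Category.ADULT)));

    static final Function<Function<Category, Boolean>, Function<Collection<Product>, String>> displayProducts =
        allowed -> ps -> ps.stream()
            .filter(p -> allowed.apply(p.category))
            .map(p -> " has product " + p.name)
            .collect(Collectors.joining(","));

    static final Function<User, String> introduce = u ->
        (u.age > 18 ? "ADULT" : "CHILD")
            + (u.gender == Gender.MALE ? " MALE" : " FEMALE") + " : " + u.name;

    static final Function<User, Collection<Product>> getProducts = u ->
        (u.gender == Gender.MALE && u.age > 18) ? u.products : Collections.<Product>emptyList();

    // Composition: first select the products, then render them.
    static final Function<User, String> productDisplayer =
        getProducts.andThen(displayProducts.apply(adultPolicy));

    static String complexMethod(User u) {
        return introduce.apply(u) + productDisplayer.apply(u);
    }

    public static void main(String[] args) {
        User bob = new User("Bob", 30, Gender.MALE, Arrays.asList(
            new Product("bike", Category.FITNESS),
            new Product("film", Category.ADULT),
            new Product("laptop", Category.COMPUTER)));
        // prints "ADULT MALE : Bob has product bike, has product laptop"
        System.out.println(complexMethod(bob));
    }
}
```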

This looks wonderful. And now the good and the bad news.

We have reduced the complexity of our lab function to CC=1, but unfortunately Sonar is unable to measure complexity inside functions. I tried Sonar 4.X and Sonar 5.X, both without success. The only solution I found for getting proper CC measurements is to use method references.

static String introduceMethod(User u){
        String result="";
        switch(u.gender) {
            case MALE:
                result= ageLabel.apply(u) + " MALE" + " : " + u.name;
                break;
            case FEMALE: result= ageLabel.apply(u) + " FEMALE" + " : " + u.name;
        }
        return result;
    }
Function<User,String> introduce=BlogComplexityFP::introduceMethod;

We saw how to handle complexity in Java in both the OOP and the FP way. Now let's quickly check how to measure complexity in the Scala world.

Scala Measurements

I believe that Scala's main quality tool is Scalastyle. It has a plugin for Sonar, but it is not as rich as the one for Java. In Scalastyle we can set an acceptable cyclomatic complexity level, and if the code exceeds it a warning will be raised.

So if I have this ugly piece of code

def complexMethod(u: User): String = {
    if (u.age > 18) {
      if (u.gender == MALE) {
        var value: String = "Adult Male: " + u.name
        for (p <- u.products) {
          if (p.category != Category.ADULT) {
            value += "has product " + p.name + ","
          }
        }
        value
      }
      else {
        "Adult Female: " + u.name
      }
    }
    else {
      if (u.gender == MALE) {
        "Child Male: " + u.name
      }
      else {
        "Child Female: " + u.name
      }
      }
    }
  }

Then I will only receive a warning

More info will be displayed in the sbt console

ProceduralExample.scala:26:6: Cyclomatic complexity of 6 exceeds max of 1

And finally Scala code with CC=1

object FPExample {

  object Gender extends Enumeration{
    type Gender=Value
    val MALE,FEMALE=Value
  }

  object Category extends Enumeration{
    type Category=Value
    val FITNESS,COMPUTER,ADULT=Value
  }

  import Gender._
  import Category._

  case class Product(name:String,category: Category)
  case class User(name:String,age:Int,gender:Gender,products:List[Product])


  val policy : Set[Category] => Category => Boolean =
    forbiddenCategories => category => !forbiddenCategories.contains(category)

  val adultPolicy = policy(Set(ADULT))

  def displayProducts(policy:Category=>Boolean)(ps:Seq[Product]):String=
    ps
      .filter(p=>policy(p.category))
      .map(" has product " + _.name)
      .mkString(",")

  val agePrefix: User=>String= u =>
    if(u.age>18) "adult" else "child"

  val introduce : User=>String = _ match {
    case u @ User(name,_,MALE,_) => agePrefix(u) + " Male :" + name
    case u @ User(name,_,FEMALE,_) => agePrefix(u) + " Female :" + name
  }

  val getProducts: User=>Seq[Product]= _ match {
    case User(_,age,MALE,products) if age>18 => products
    case _ => Seq[Product]()
  }

  val productDisplayer=getProducts andThen displayProducts(adultPolicy)

  def complexMethod(u: User): String = introduce(u) + productDisplayer(u)
}

Is this complex or simple code? It's difficult to say, because cyclomatic complexity tends not to be very helpful when lambdas enter the scene. A couple of paragraphs below we will think about a possible solution.

Non Linear Nature of Complexity

OK, after all those examples and exercises, let's try to understand the non-linear nature of complexity. We saw that when we move logic to external components or functions, we can make them context-unaware to some degree. They then receive information about the context they are used in through various types of parametrization.

So if, for example, I have a Component A with CC=2 and I want to use this component in some Component B, then this action does not raise the CC of Component B. Of course I assume that Component A was properly tested and that it behaves in a predictable way.

Now if I took the logic out of Component A and pasted it directly into Component B, then most likely I could not use this piece of code in Component C, because it is already embedded in Context B. This way I raised CC by 4 with the same piece of code.

If it is not clear yet then let's return to this part :

if(age>18){
   if(MALE){...} //1
   else{...}
}else{
   if(MALE){...} //2
   else{...}
}

Both ifs, //1 and //2, are similar, but each is aware of its context: one is in the context of age>18 and the other in the context of age<=18. So every occurrence of similar code will raise CC by 2.
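The duplication disappears once each decision lives in its own context-unaware function. In the sketch below (the Labels class and its method names are hypothetical), the gender check is written once and reused for both age groups, so the code contains two decision points instead of three.

```java
enum Gender { MALE, FEMALE }

class Labels {
    // CC=2: knows nothing about gender.
    static String ageLabel(int age) {
        return age > 18 ? "Adult" : "Child";
    }

    // CC=2: knows nothing about age, so it works in both age contexts.
    static String genderLabel(Gender gender) {
        return gender == Gender.MALE ? "Male" : "Female";
    }

    // CC=1: plain composition, no branching here.
    static String describe(String name, int age, Gender gender) {
        return ageLabel(age) + " " + genderLabel(gender) + ": " + name;
    }

    public static void main(String[] args) {
        System.out.println(describe("Bob", 30, Gender.MALE));   // prints "Adult Male: Bob"
        System.out.println(describe("Eve", 10, Gender.FEMALE)); // prints "Child Female: Eve"
    }
}
```

Adding a third independent property here would add one more small function, while the nested version would need the new check repeated in every existing branch.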

OOP & FP

I hope I also managed to show that proper usage of either OOP or FP mechanisms reduces complexity. According to my observation, programmers generate complexity mainly not by using the wrong paradigm but by using any paradigm in a wrong way. In shorter words: first learn OOP or FP properly and only then discuss FP vs OOP (and of course I should do this too).

Psychological Aspect

Till now we have discussed how code complexity influences the code itself, but can we actually check how this complexity affects developers?

Actually, there was an interesting experiment more than half a century ago, an experiment which described the limits of human cognition. Its description can be found here: The Magical Number Seven, Plus or Minus Two. According to the researcher, George A. Miller, a human being can process 7 +/- 2 chunks of information at once. You can find more info on Wikipedia to better understand what exactly a chunk of information is.

The answer to the question "why do developers write code with bugs?" is: because developers are (still) humans ;)

Summary

I hope that this "complexity adventure" will be helpful to those who are curious how their code obtains its "arrow shape" and where complexity is lurking. We also checked how the usage of FP influences cyclomatic complexity, and saw a limitation of procedural metrics in the FP world. It opens up an interesting question about how to measure the complexity of functional code. Very often functional constructs build a solution without the ifs and fors which are the basis of the CC calculation.

What can we do?

  • build more intuition around scala style rules
  • research functional programming complexity measurements papers
  • Some materials for further investigation:
