tag:blogger.com,1999:blog-19061269445353379642024-03-05T21:04:35.999+01:00Improve Yourself and Your CodePaweł Włodarskihttp://www.blogger.com/profile/04891037231290616803noreply@blogger.comBlogger9125tag:blogger.com,1999:blog-1906126944535337964.post-7697640976038637262017-09-05T22:44:00.000+02:002017-09-05T23:20:53.309+02:00Program as a total function - the real nature of bug<p class="akapit">
Things are obvious to us. We live on auto-pilot, passing the same things every day, so we have stopped wondering what they really are. We got accustomed to them and gave up any reflection or everyday analysis.
</p>
<p>
<i>"2+2=4"</i> seems to be obvious. Can there be any surprise? Maybe in a language which allows overriding the "+" operator someone could set a trap, but otherwise there should be nothing surprising here. Yet there is actually a <b>non-trivial proof that 2+2 is 4</b>. You can find it here -> <a href="http://www.cs.yale.edu/homes/as2446/224.pdf">A real proof that 2+2=4</a>
</p>
<p>
Another situation and another statement. <i>"There is a bug in my program" </i>- this is a very common situation, so you have most likely become used to its context and you will not redefine the general concepts of <b>"program"</b> and <b>"bug"</b> each time - it's as obvious as 2+2=4.
</p>
<p>
But if a program is working without any exception, is there a bug in it? If there is a bug but no one is able to spot it - can we say that there is a bug? Is a bug the same in theory and in practice? To answer these questions we need to investigate the concepts of "program" and "bug" a little bit deeper...
</p>
<h2>Roots</h2>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwMLKbmxgyhnRAJJQSoU5v40uSwQnvOT1mag1RS4SqG_YH9GUlbSu_omUJpPxS9olaIXV3CP4drtt3cXYvZGlcLV8cAb7WY83OB0wZAlr_-iND9Xgcj-IDtEyjKs0lJE1S4Y7ZLjfjvmwz/s1600/humancomputer.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgwMLKbmxgyhnRAJJQSoU5v40uSwQnvOT1mag1RS4SqG_YH9GUlbSu_omUJpPxS9olaIXV3CP4drtt3cXYvZGlcLV8cAb7WY83OB0wZAlr_-iND9Xgcj-IDtEyjKs0lJE1S4Y7ZLjfjvmwz/s320/humancomputer.jpg" width="320" height="251" data-original-width="1150" data-original-height="901" /></a></div>
<p>
The picture posted above is copied from the Wikipedia page called <a href="https://en.wikipedia.org/wiki/Human_computer">Human Computer</a>. Yes... "computers" existed even before computers were invented.
</p>
<div class="cytat" style="padding:5px">
The term "computer", in use from the early 17th century (the first known written reference dates from 1613), meant "one who computes": a person performing mathematical calculations, before electronic computers became commercially available.
</div>
<p>
Ha! So not only do we need to define a program, but it is also not clear what a computer is. So now the sentence "I just bought a computer" may as well describe the purchase of a metal box or, in the worst case, an act of slavery.
</p>
<p>
The part where they are described as <b>"a person performing mathematical calculations"</b> is very interesting. So a "computer" was performing mathematical calculations - even very simple ones like the following:
</p>
<pre>
fun isRectangularTriangle(a:Integer,b:Integer,c:Integer):Boolean =
pow(a,2)+pow(b,2) == pow(c,2)
</pre>
<p>
So this <i>"mathematical calculation"</i> checks if <blockquote>a^2 + b^2 = c^2</blockquote> or, in other words, if the triangle is right-angled ("rectangular"). But can we be sure it works that way? What if <i>pow</i> actually just triggers a "beep" from the computer speaker?
</p>
<p>
What would be extremely useful here is a "law" or "theorem" stating that for <i>every possible argument we are sure that pow(x,2)=x*x</i>. Of course now we need to define how the "*" sign works, but let's assume this is standard multiplication - it is just an introduction to more interesting topics.
</p>
<h2>Can something go wrong? </h2>
<p>
If we look closely at our right-angled triangle function we may think that the obvious problems appear at the ends of the <b>int</b> bounds. What is the int range? "I know, I know, in the Java docs it is 2 <sup>31</sup>". Yes (more precisely, from -2<sup>31</sup> to 2<sup>31</sup>-1). This is an example of seeing the world through the lens of a single programming language. Unfortunately many programmers stay with their first commercial language for their whole life and mistake its limitations for general programming rules. If you switch from Java to Haskell you will see that an Integer does not need to have any limits:
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvNOptIrhg3ZJssm1JmPacFg8xi-hCFU7ts-VDyUcHlk5jua5JmhZb_wenno1Vde9ou6bTKVWcvzXXjBLne_OpHBspBtX2jqHcTRAnx-tr2TJQbOzDJnZ5G1iN1x48U9lb_EAByrJd9kl3/s1600/haskell_bignumbers.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvNOptIrhg3ZJssm1JmPacFg8xi-hCFU7ts-VDyUcHlk5jua5JmhZb_wenno1Vde9ou6bTKVWcvzXXjBLne_OpHBspBtX2jqHcTRAnx-tr2TJQbOzDJnZ5G1iN1x48U9lb_EAByrJd9kl3/s640/haskell_bignumbers.jpg" width="640" height="182" data-original-width="1600" data-original-height="455" /></a></div>
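<p>
The same contrast can be sketched in Scala, where <i>Int</i> is a 32-bit value and <i>BigInt</i> has arbitrary precision. The object and function names below are made up for this illustration; the point is only that the fixed-size version silently gives a wrong answer once the squares overflow.
</p>

```scala
object TriangleOverflow {
  // The check from isRectangularTriangle on 32-bit Int:
  // the squares can silently wrap around when they exceed the Int range.
  def isRightTriangleInt(a: Int, b: Int, c: Int): Boolean =
    a * a + b * b == c * c

  // The same check on arbitrary-precision BigInt, closer to Haskell's Integer.
  def isRightTriangleBig(a: BigInt, b: BigInt, c: BigInt): Boolean =
    a * a + b * b == c * c
}
```

<p>
For sides 65536, 65536, 65536 each square is 2<sup>32</sup>, which wraps to 0 in an <i>Int</i>, so the first function claims that an equilateral triangle is right-angled, while the <i>BigInt</i> version correctly answers no.
</p>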
<h2>Mathematical proof of reading from a file</h2>
<p class="akapit">
While we theorize about the right-angled triangle problem we need to take one step back so as not to miss a very interesting area. The question is: <b>how are a, b and c actually provided</b>? Maybe somewhere there is a CSV file with only one line?
<pre>
3,4,5
</pre>
</p>
<p>
Then maybe we can read those values with these two functions:
<pre>
fun read(f:File):String
fun parse(line:String):(a:Integer,b:Integer,c:Integer)
</pre>
</p>
<p>
And then compose those three functions to get one solid mechanism which calculates the answer according to the provided data file:
<pre>
read andThen parse andThen isRectangularTriangle
</pre>
</p>
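<p>
A minimal Scala sketch of this composition could look as follows. The bodies of <i>read</i> and <i>parse</i> are assumptions made for the illustration (the article gives only their signatures), and the file is identified here by a plain path string instead of a <i>File</i> type.
</p>

```scala
import scala.io.Source

object NaivePipeline {
  // Assumed implementation: returns the first line of the file at `path`.
  val read: String => String = path => {
    val source = Source.fromFile(path)
    try source.getLines().next() finally source.close()
  }

  // Assumed implementation: turns "3,4,5" into a tuple of three integers.
  val parse: String => (Int, Int, Int) = line =>
    line.split(",").map(_.trim.toInt) match {
      case Array(a, b, c) => (a, b, c)
    }

  val isRectangularTriangle: ((Int, Int, Int)) => Boolean = {
    case (a, b, c) => a * a + b * b == c * c
  }

  // The three functions glued into one: path => answer.
  val check: String => Boolean =
    read andThen parse andThen isRectangularTriangle
}
```

<p>
Every type lines up, so the compiler is happy. Nothing in those types says what happens when the file is missing or the line is malformed.
</p>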
<p>
<i>
- Hooooooooooooooold on! And what if there is an error?<br/>
- what error? <br/>
- a file reading error <br/>
- what? .. how??<br/>
- like there is no file or something <br/>
- so maybe... maybe... we could assume that it is always there ...?
</i>
</p>
<p>
This very practical problem was described, together with possible solutions, in the following abstract:
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKz-SqHseeAqO0llojLVHcJyMs-WTvhnV0epbAOsYI5y2MqbxzWrwaRhRzBshAgdOkCUzZd78pWmsKzB0xDSyrMA3xJF8zQcbAeTaHezS7lkv2EqyZfXBTDVx-VD6oTS3gjU4F1VxkZWd5/s1600/abstract.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiKz-SqHseeAqO0llojLVHcJyMs-WTvhnV0epbAOsYI5y2MqbxzWrwaRhRzBshAgdOkCUzZd78pWmsKzB0xDSyrMA3xJF8zQcbAeTaHezS7lkv2EqyZfXBTDVx-VD6oTS3gjU4F1VxkZWd5/s640/abstract.jpg" width="640" height="168" data-original-width="979" data-original-height="257" /></a></div>
<p>
As stated in the screenshot, theoretical knowledge comes across serious difficulties when you want to describe something which can <b>break</b>. How can you mathematically describe a function which reads a file or executes a query on a database?
</p>
<p>
A solution to this problem was finally found by Eugenio Moggi, and the abstract above is taken from his work called <a href="https://core.ac.uk/download/files/145/21173011.pdf">Notions of computation and monads
</a>
</p>
<h2>Bug as a function</h2>
<p>
As a reminder: addition is an example of a <b>Total Function</b> because a result is defined for every argument. In contrast, division is not defined for zero, so it is a <b>Partial Function</b>. When we have a method/function/whatever which claims that it is able to divide any two numbers
<pre>
divide(a:Double,b:Double) : Double
</pre>
then we have an error. For one specific value, but still - an error. Other examples:
</p>
<p>
<ul>
<li>Function <b>RegistrationForm => User</b> is defined for every "proper" form but not defined for a form which has the value "aaaaa" in the <i>age</i> field.</li>
<li>Function <b>File=>String</b> is defined for the set of files which actually exist at the expected location </li>
<li>Function <b>OrderId=>Order</b> is defined only for existing orders </li>
<li>Function <b>RequestToExternalAPI => RESPONSE</b> is of course a partial function, defined for the cases when we actually receive a response</li>
</ul>
</p>
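<p>
The first bullet can be sketched in Scala like this. The types <i>RegistrationForm</i> and <i>User</i> and both functions are hypothetical, invented only to show a signature that promises more than the body can deliver; <i>scala.util.Try</i> is used just to observe the failure without crashing.
</p>

```scala
import scala.util.Try

object Registration {
  // Hypothetical domain types, named after the first bullet above.
  final case class RegistrationForm(name: String, age: String)
  final case class User(name: String, age: Int)

  // The signature promises a User for every form,
  // but toInt throws NumberFormatException for an age like "aaaaa".
  def register(form: RegistrationForm): User =
    User(form.name, form.age.toInt)

  // Try does not repair anything; it only makes the undefined inputs visible.
  def attempt(form: RegistrationForm): Try[User] =
    Try(register(form))
}
```

<p>
Calling <i>attempt</i> with age "30" gives a <i>Success</i>, while age "aaaaa" gives a <i>Failure</i>: the function was partial all along, only its type did not say so.
</p>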
<p>
Dear reader - I don't know if you have ever approached the definition of an error from this perspective. During my "Java Period" I treated partial functions like total functions by cheating with exceptions. Sometimes we control exceptions by throwing them, but sometimes exceptions control us by being thrown by surprise (maybe breaking a 1,000,000 dollar purchase process). So beside the most obvious definition that <b>a bug</b> is a situation when a program returns an incorrect result we can add another one:
</p>
<div class="cytat" style="font-size: 1.3em;">
Error - a situation when a partial function is used as a total function.
</div>
<p>
So something like this :
<pre>
readGeometry:String=>Boolean = parseLine andThen isRectangularTriangle
</pre>
</p>
<p>
can be defined as a <b>bug</b> because the <i>parseLine</i> function is defined only for specially formatted lines, which are a subset of all possible text lines. This solution can even pass all tests if the test data is very naive.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRKL9PNis3DKssCqsFbfMZ7ECqEeqsTK-TD9LCfsCtdTfW8ioIOiOHz9AXKizJ73qnRR3NDaKG5wqlgJP2c1-25xYNW_xDwcOVjHAsxdNcgB8Ba8rjU2y9jlcJEDuYLZGmMYEXDly9DIIp/s1600/partialfunction.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhRKL9PNis3DKssCqsFbfMZ7ECqEeqsTK-TD9LCfsCtdTfW8ioIOiOHz9AXKizJ73qnRR3NDaKG5wqlgJP2c1-25xYNW_xDwcOVjHAsxdNcgB8Ba8rjU2y9jlcJEDuYLZGmMYEXDly9DIIp/s640/partialfunction.png" width="640" height="253" data-original-width="860" data-original-height="340" /></a></div>
<div style="font-size: 12px;font-weight: bold">
Partial Function visualization. Red represents an input parameter for which the given function is not defined - like a database query which causes a failure
</div>
<h2>Compositions and Effects </h2>
<p>
The general problem with exceptions is that it is difficult or even impossible to compose functions when we are <b>cheating with types</b>. And we want to compose, because composition is the Holy Grail of programming. How to explain it? When we have a function <i>String=>(Int,Int)</i> which promises us that it will return a <i>tuple (Int,Int)</i>, but in reality it can return this tuple or who knows which exception, then we cannot just compose this function with another one, <b>(Int,Int) => Boolean</b>, because we cannot be sure that we will receive <i>(Int,Int)</i> at the input.
</p>
<p>
OK - it is very easy to find flaws, but is there any solution? Yes there is! Instead of pretending that a partial function is total we can actually convert a partial function into a total function using type manipulation. And this is (most likely) what Eugenio Moggi proposed in his work ("most likely" because I'm unable to fully understand it)
</p>
<p>
So let's start with the classical solution:
</p>
<pre style="background:#eee;color:#3b3b3b"><span style="color:#069;font-weight:700">val</span> divide:(<span style="color:#ff5600">Int</span>,<span style="color:#ff5600">Int</span>)=><span style="color:#ff5600">Int</span> = (a,b) => a/b
<span style="color:#069;font-weight:700">try</span>{
divide(<span style="color:#a8017e">4</span>,<span style="color:#a8017e">0</span>)
}<span style="color:#069;font-weight:700">catch</span>{
<span style="color:#069;font-weight:700">case</span> e:ArithmeticException => logException
}
</pre>
<p>
It is important to understand that <b>try</b> is an artificial <i>glue</i> here which connects reality to fake types.
Let's reveal the true nature of the function.
</p>
<pre style="background:#eee;color:#3b3b3b"><span style="color:#069;font-weight:700">val</span> partialDivide : PartialFunction[(<span style="color:#ff5600">Int</span>,<span style="color:#ff5600">Int</span>),<span style="color:#ff5600">Int</span>] = {
<span style="color:#069;font-weight:700">case</span> (a,b) <span style="color:#069;font-weight:700">if</span> b!=<span style="color:#a8017e">0</span> => a/b
}
println(partialDivide.isDefinedAt(<span style="color:#a8017e">4</span>,<span style="color:#a8017e">0</span>)) <span style="color:#af82d4">//false</span>
</pre>
<p>
<b>Now the magic happens!</b> - we change the partial function into a total function by introducing a different category of types
</p>
<pre style="background:#eee;color:#3b3b3b"><span style="color:#069;font-weight:700">val</span> totalDivide: ((<span style="color:#ff5600">Int</span>, <span style="color:#ff5600">Int</span>)) => Option[<span style="color:#ff5600">Int</span>] =partialDivide.lift
println(totalDivide(<span style="color:#a8017e">4</span>,<span style="color:#a8017e">0</span>)) <span style="color:#af82d4">//None</span>
println(totalDivide(<span style="color:#a8017e">4</span>,<span style="color:#a8017e">2</span>)) <span style="color:#af82d4">//Some(2)</span>
</pre>
<p>
"totalDivide" is a <b>total</b> function and it has a defined result for every argument. Where division makes sense the result will be <b>Some(number)</b>, but where it is not defined - for a zero divisor - it is <b>None</b>. The question may appear - <i>"ok, but how does this differ from null?"</i>. First of all we know the type of the Option - it may be Option[Int], Option[String], Option[User], whereas null is just ... null. When we have a defined set of values, <i>Some</i> and <i>None</i>, we can transform those values with pattern matching and higher-order functions - the first mechanism did not exist in Java at the time of writing and the second was quite new - maybe that's why this approach is not so popular in the mainstream.
</p>
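<p>
Both mechanisms can be shown on a small, self-contained sketch. The version of <i>totalDivide</i> below is written directly with <i>if</i> instead of <i>lift</i>, and the helper names <i>describe</i> and <i>doubled</i> are invented for this example.
</p>

```scala
object OptionUsage {
  // Total version of division: None stands for "not defined for this input".
  def totalDivide(a: Int, b: Int): Option[Int] =
    if (b != 0) Some(a / b) else None

  // Pattern matching: both possible shapes of the result are handled explicitly,
  // and the compiler can warn when one of them is forgotten.
  def describe(result: Option[Int]): String = result match {
    case Some(value) => s"result: $value"
    case None        => "division is not defined here"
  }

  // Higher-order functions: map transforms the value only when it exists,
  // getOrElse supplies a fallback for the None case.
  def doubled(a: Int, b: Int): Int =
    totalDivide(a, b).map(_ * 2).getOrElse(0)
}
```

<p>
Unlike with null, forgetting the "no result" case is hard here: the caller has to go through <i>Option</i> to reach the number at all.
</p>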
<p>
We talked a little bit about composition, so how can this totality help here? There is a whole theory describing how to compose those things - some popular words which may be known to you, dear reader, are <i>Functor</i> or <i>Monad</i>, but in general you can call those "wrapper types" <b>Effects</b>. So Option will be an effect of "lack of something", like the result of division by zero, Future can be an <i>effect of time</i> in computation etc...
</p>
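<p>
As a sketch of how an effect helps with composition, here is one possible total variant of the <i>readGeometry</i> example. The parsing details are assumptions (comma-separated integers), and <i>scala.util.Try</i> is used only to turn a failed number conversion into <i>None</i>.
</p>

```scala
import scala.util.Try

object SafeGeometry {
  // Total replacement for parseLine: malformed lines become None instead of an exception.
  def parseLine(line: String): Option[(Int, Int, Int)] =
    line.split(",").map(_.trim) match {
      case Array(a, b, c) => Try((a.toInt, b.toInt, c.toInt)).toOption
      case _              => None
    }

  def isRectangularTriangle(sides: (Int, Int, Int)): Boolean = {
    val (a, b, c) = sides
    a * a + b * b == c * c
  }

  // map applies the pure total function inside the effect,
  // so "no result" travels through the pipeline without any try/catch.
  def readGeometry(line: String): Option[Boolean] =
    parseLine(line).map(isRectangularTriangle)
}
```

<p>
The type <i>String => Option[Boolean]</i> now tells the truth: for "3,4,5" the answer is <i>Some(true)</i>, and for a line like "hello" it is <i>None</i>.
</p>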
<p>
This <i>Effect</i> should follow some well-defined laws which can be checked by tools like QuickCheck or ScalaCheck - there is even a ready-to-use <a href="http://eed3si9n.com/learning-scalaz/Functor+Laws.html">test template in scalaz</a>
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmIyz-lRF2hq4D3Qtfk5n8lhc0r_m2IBLdnbsNsu5EfJLfXaMSXhrD5b-6HiAQp9t8kOVQ9-jbunGFHB8ffS2VB56NZSqaUad4STt1QOnPafoJTqwB5ZBcI98rP916ir9Fc0K6tM-zG6Bl/s1600/funktor.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmIyz-lRF2hq4D3Qtfk5n8lhc0r_m2IBLdnbsNsu5EfJLfXaMSXhrD5b-6HiAQp9t8kOVQ9-jbunGFHB8ffS2VB56NZSqaUad4STt1QOnPafoJTqwB5ZBcI98rP916ir9Fc0K6tM-zG6Bl/s640/funktor.jpg" width="640" height="132" data-original-width="1050" data-original-height="217" /></a></div>
<p>
If our effect follows those laws we can easily compose it with pure total functions, like in this example
<pre>
Option(x).map(f).map(g) is equal to Option(x).map(f andThen g)
</pre>
</p>
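<p>
The equality above can be checked by hand for a few values with plain Scala. This is only a hand-rolled spot check with arbitrarily chosen functions <i>f</i> and <i>g</i>, not a replacement for a property-based tool, which would generate the inputs for us.
</p>

```scala
object FunctorLawCheck {
  // Two arbitrary pure, total functions chosen for the illustration.
  val f: Int => Int = _ + 1
  val g: Int => String = n => s"#$n"

  // Composition law for Option: mapping twice equals mapping once
  // with the composed function.
  def compositionHolds(value: Option[Int]): Boolean =
    value.map(f).map(g) == value.map(f andThen g)

  // A handful of sample inputs, including the "nothing there" case.
  val samples: List[Option[Int]] =
    List(None, Some(0), Some(-7), Some(Int.MaxValue))

  val lawHoldsForSamples: Boolean = samples.forall(compositionHolds)
}
```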
<h2>Summary</h2>
<p>
Let's stop here. The topic is enormous and a lot of powerful constructs for composing Effects, like Monad Transformers, Kleisli or Foldable, wait around the corner. We started this article with the thesis that every day we work in auto-pilot mode and we don't have the time or energy to contemplate the things we do every day. So instead of diving into the details of the machinery, stay on a more abstract level and compare your usual approach to writing programs with the philosophy of programming described in this article. Maybe something magical will happen...
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqdcT0f29M07C9yHHoHlO3hLO3ULNVn7S-zKV6KW6jaZvObreufqQkW6BOnJxXq6ElHSMuhKUt44CqATv_3x6JA3rPlp48eFmv6btzfT9xXpXildcm6_6PKmybLCKmkHxFzCdbLwk9AJGR/s1600/slonecznik.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqdcT0f29M07C9yHHoHlO3hLO3ULNVn7S-zKV6KW6jaZvObreufqQkW6BOnJxXq6ElHSMuhKUt44CqATv_3x6JA3rPlp48eFmv6btzfT9xXpXildcm6_6PKmybLCKmkHxFzCdbLwk9AJGR/s400/slonecznik.jpg" width="400" height="400" data-original-width="1080" data-original-height="1080" /></a></div>Paweł Włodarskihttp://www.blogger.com/profile/04891037231290616803noreply@blogger.com0tag:blogger.com,1999:blog-1906126944535337964.post-28776446170714015252017-08-07T19:12:00.001+02:002017-08-07T20:43:18.670+02:00Levels of Abstraction<p class="akapit">
To begin with, feel the very different levels of abstraction in these two sentences: <i>"turn on the computer"</i> vs <i>"on the molecular level trigger a flow of electrons which causes millions of transistors and hundreds of various electrical circuits to reach a particular state"</i>
</p>
<p>
In our daily life and daily communication we subconsciously adapt the level of detail in our words so that the other side can understand us with minimal mental energy and without the need to decode many irrelevant aspects of information.
</p>
<p>
Do we do the same in our code?
</p>
<p>
Robert C. Martin in his book "Clean Code" describes easy-to-maintain functions/methods as:
<ul>
<li>small </li>
<li>responsible for only one thing </li>
<li><b>working on one level of abstraction</b></li>
</ul>
</p>
<p>
The first two points should be quite easy to imagine (however, functional languages like Scala added another constraint: functions should be not only short but also <b>thin</b>). Yet the third point, about levels of abstraction, is... well, abstract itself and thus may be non-intuitive.
</p>
<p>
It is always easier to demonstrate abstract concepts with a practical metaphor.
And what can be more practical than a recipe for scrambled eggs!
</p>
<h2>The recipe</h2>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioMA-VfWmoBOHrKStXjFDd1rNC_VmFJO5HsAZJVfcP028xm-p_UFvFul0YrNuimpaO85KrN4RhN30vegnr06_a7vJ0_aqgyJKv1qniovvclpqHQPf7DNTJJwG2DANyp4k3m6S7dChhmHE2/s1600/scrambledeggs.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEioMA-VfWmoBOHrKStXjFDd1rNC_VmFJO5HsAZJVfcP028xm-p_UFvFul0YrNuimpaO85KrN4RhN30vegnr06_a7vJ0_aqgyJKv1qniovvclpqHQPf7DNTJJwG2DANyp4k3m6S7dChhmHE2/s320/scrambledeggs.jpg" width="318" height="320" data-original-width="954" data-original-height="960" /></a></div>
<ol>
<li>Heat the butter in a small frying pan</li>
<li>Add ham cut into small slices and finely chopped chives to the melted butter</li>
<li>Break the eggshell with a sharp knife and pour the contents into the pan </li>
<li>Add some salt and pepper and stir until the eggs are set </li>
</ol>
<p>
The recipe is rather intuitive and no one should have a problem understanding it. At the level of abstraction used in the recipe you only need to understand how to handle a couple of <i>"elements"</i> from the <b>kitchen domain</b> and our final product is ready!
</p>
<p>
However, some relatively complex things are happening "beneath". Those complex things bring up concepts which are not important from the perspective of our food domain.
</p>
<p>
If we wanted to model this recipe in code, what would code with a broken level of abstraction look like?
</p>
<h2>The bad code </h2>
<pre style="background:#fff;color:#3b3b3b"><span style="color:#21439c">add</span>(butter)
<span style="color:#21439c">while</span>(butter.isInConsistentState){
<span style="color:#21439c">for</span>(fatMolecule in butter)
<span style="color:#21439c">for</span>( atom in atomsInfatMolecule){
atom.<span style="color:#21439c">incrementEnergy</span>(heat)
}
}
<span style="color:#21439c">add</span>(slicedHam)
<span style="color:#21439c">while</span>(chive.isInNotManyPieces){
use knife to create force which will break chive internal structure
}
<span style="color:#21439c">add</span>(choppedChive)
knife.storePotentialEnergy
knife.transformPotentialEnergyIntoKineticEnergy
knife.<span style="color:#21439c">hit</span>(egg)
<span style="color:#21439c">add</span>(eggContent)
<span style="color:#21439c">add</span>(salt)
<span style="color:#21439c">for</span>(pepperSeed in <span style="color:#0053ff;font-weight:700">A</span> pinch <span style="color:#069;font-weight:700">of</span> pepper){
<span style="color:#21439c">add</span>(pepperSeed)
}
<span style="color:#069;font-weight:700">and</span> heat till it is ready
</pre>
<p>
(The naming of physical concepts may not be accurate, but it is just an illustration. Also, food names are taken directly from Google Translate - the author knows only several English words from the food domain: proteins, fat, carbohydrates, chicken, rice, broccoli)
</p>
<p>
Now imagine such a recipe posted on some page with food recipes. People would be extremely confused. Moreover, to understand some of the concepts it is no longer enough to be just a "good cook" - now it would be good to remember some things from physics lessons.
</p>
<p>
Now let's return to a more natural, single level of abstraction
</p>
<h2>One Level of Abstraction</h2>
<pre style="background:#fff;color:#3b3b3b">fryingPan.add(butter)
fryingPan.heat(butterIsMelted)
fryingPan.add(slicedHam)
fryingPan.add(choppedChive)
fryingPan.add(break(egg))
fryingPan.add(salt)
fryingPan.add(pepper)
fryingPan.heatTillReady()
</pre>
<p>
This recipe is shorter, more readable and doesn't force you to know concepts from outside the "food domain". You can also observe an interesting visual effect which can be used as a <b>test for one level of abstraction</b> - in the second implementation there is no indentation!
</p>
<p>
So after this pseudo-example, how does this problem touch IT? Very often you will see something similar to this:
</p>
<pre style="background:#fff;color:#3b3b3b">function(object){
domainOperation()
internals=object.getInternals
...
<span style="color:#069;font-weight:700">while</span>(..){
<span style="color:#069;font-weight:700">for</span>{
<span style="color:#069;font-weight:700">for</span>{
<span style="color:#069;font-weight:700">for</span>{
<span style="color:#069;font-weight:700">if</span>(...){
<span style="color:#069;font-weight:700">for</span>{...}
}
}
}
}
}
domainOperation()
<span style="color:#069;font-weight:700">for</span>(i in <span style="color:#069;font-weight:700">object</span> internals) ...
}
</pre>
<div class="dygresja">
<h2>For managers</h2>
This situation costs your company real money. When levels of abstraction are mixed, code analysis takes longer and is thus more expensive. Also, changes will very often have to be introduced on a couple of levels, raising the probability of introducing new bugs, so new costs again. Remember that.
</div>
<p>
This code breaks the one-level-of-abstraction rule and very often forces the reader to jump between concepts from different domains. So the next time you see a domain concept cut by a technical low-level detail you should recognize that you have just jumped through different layers of abstraction - maybe it’s your fault, or maybe someone some time ago wrote shitty code and said “this is technical debt, this is fine”.
</p>
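<p>
One common way to repair such a function is to give every low-level fragment a name from the domain and move it one level down. The sketch below uses an invented ordering domain (<i>Order</i>, <i>Line</i>, a percentage discount); none of these names come from real code, they only illustrate the shape of the refactoring.
</p>

```scala
object OrderReport {
  // Hypothetical domain types used only for this illustration.
  final case class Line(price: BigDecimal, quantity: Int)
  final case class Order(lines: List[Line], discountPercent: Int)

  // Domain level: reads like a sentence from the business description
  // and contains no loops or arithmetic details.
  def totalToPay(order: Order): BigDecimal =
    applyDiscount(sumOfLines(order.lines), order.discountPercent)

  // One level lower: the details live behind names taken from the domain.
  private def sumOfLines(lines: List[Line]): BigDecimal =
    lines.map(line => line.price * line.quantity).sum

  private def applyDiscount(amount: BigDecimal, percent: Int): BigDecimal =
    amount * (100 - percent) / 100
}
```

<p>
The top function passes the "no indentation" test from the recipe, and a reader interested only in the business rule never has to look at how the sum is computed.
</p>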
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNDpqbhvcRp-nF_a_3kjpIsjTAh-beC2c2CQcycT0Dc0DB1ZFCkS3KpHcNQFjedNXldxkzL2WfNqEesdwnTgT2rE06R4ogcS3wjxGxyMZjsnfmYK_8UeKlD82L34pJNe_n0eOLX9lR2VlZ/s1600/wlosiennica_morskie_oko.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNDpqbhvcRp-nF_a_3kjpIsjTAh-beC2c2CQcycT0Dc0DB1ZFCkS3KpHcNQFjedNXldxkzL2WfNqEesdwnTgT2rE06R4ogcS3wjxGxyMZjsnfmYK_8UeKlD82L34pJNe_n0eOLX9lR2VlZ/s640/wlosiennica_morskie_oko.JPG" width="640" height="480" data-original-width="1600" data-original-height="1200" /></a></div>
Paweł Włodarskihttp://www.blogger.com/profile/04891037231290616803noreply@blogger.com0tag:blogger.com,1999:blog-1906126944535337964.post-26429212341099022772017-07-16T18:31:00.000+02:002017-07-16T20:12:14.956+02:00Rich in paradigms<p class="akapit">
<b>"You can have more!"</b> - this slogan can be found in many TV commercials. This thesis is quite dangerous in the IT world because you do not necessarily want to have more lines of code, which may raise the probability of a bug (although there are stories of some companies having the number of code lines in their KPIs!), but on the other hand maybe you can benefit from having/knowing more approaches to solving a given problem?
</p>
<p>
Our brains are <a href="https://www.theguardian.com/science/2015/jan/18/modern-world-bad-for-brain-daniel-j-levitin-organized-mind-information-overload">Overloaded</a> and maybe that's why simple explanations are so pleasant for us. I remember there was a time when I was actually jealous that .NET developers had only one framework to learn, while in Java you would feel frustrated knowing you didn't understand N available tools, because to "be pragmatic" you would have to justify why you chose one approach over another. Yet while having a rich choice of tools, you had only one choice in the solution domain - „everything is an object”.
</p>
<p>
This approach has started losing popularity in recent years. When I had a chance to teach Scala to both C# and Java developers – the first group understood functions quite naturally and even monads were something familiar to them - „it's like LINQ, it's like LINQ”. And Java? "Java made functions popular by not having them". When in 2014 we posted news about Java 8 on our JUG group – well, the C# group invited us to their meeting to see what would be in Java in 5-10 years.
</p>
<p>
By having a richer language, C# devs started learning new paradigms earlier than the standard Java dev, but of course there is the other side – the less complicated a language is, the less power you have, but maybe also fewer chances to make exotic mistakes. So the question is: can we "have more" good things and "have fewer" problems at the same time?
</p>
<h1>Golden hammer</h1>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh55q1SOTnny4-ZxthalKkKxjdh5A_s__rMzY9RYPCcP06vmglb6zzAd1isyYleiFkVw4JRLO1bl5R1wQYyWuIFSh_7DuPPOF38ioViE_C32HrGjFlaAqsDSlQDc8m3LGoY0Osl0mNH2Boh/s1600/functional-programming-taxonomy.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh55q1SOTnny4-ZxthalKkKxjdh5A_s__rMzY9RYPCcP06vmglb6zzAd1isyYleiFkVw4JRLO1bl5R1wQYyWuIFSh_7DuPPOF38ioViE_C32HrGjFlaAqsDSlQDc8m3LGoY0Osl0mNH2Boh/s400/functional-programming-taxonomy.png" width="400" height="295" data-original-width="338" data-original-height="249" /></a></div>
<p>
In the picture above you can see the family of programming paradigms – there are several different programming paradigms but we usually focus on only one of them. Why?
</p>
<p>
I have observed that when programmers discuss different approaches to coding problems they often concentrate on some particular differences which show the „superiority” of their favorite solution only in a very specific situation. So (now we are entering "metaphor mode") when, for example, someone claims that <b>a fish is better than a bird</b> because a bird reveals many disadvantages 10 meters under water – then the second person will give the counter-argument that a fish is unable to catch food in trees. Yet – both the fish and the bird were adapted to a specific "context" by evolution.
</p>
<p>
Can a similar adaptation occur on a more conceptual level? For example, we have one paradigm with the limitation „an object has to have a mutable state” and a second paradigm - „code has to preserve referential transparency” (RT in short means – function calls need to be independent of the surrounding state – this improves code readability a lot!)
</p>
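<p>
The difference between the two limitations can be shown on two tiny functions. Both are invented for this illustration: the first hides a mutable counter, the second depends only on its arguments.
</p>

```scala
object ReferentialTransparency {
  // Not referentially transparent: the result depends on hidden mutable state,
  // so two identical calls give two different answers.
  private var counter = 0
  def nextLabel(prefix: String): String = {
    counter += 1
    s"$prefix-$counter"
  }

  // Referentially transparent: the result depends only on the arguments,
  // so a call can always be replaced by its value.
  def label(prefix: String, number: Int): String =
    s"$prefix-$number"
}
```

<p>
To predict what <i>nextLabel("a")</i> returns you need to know the whole history of earlier calls; for <i>label("a", 1)</i> reading the call itself is enough.
</p>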
<p>
Similarly to the bird and the fish, it is easy to find examples where those conceptual limitations bring more problems than solutions. But again, as with the "animal metaphor" – those approaches evolved in specific circumstances.
</p>
<p>
And a <b>limitation</b> understood as <b>specialization</b> can bring a lot of good things. Each year we organize the <a href="http://gdcr.coderetreat.org/">Global Day Of Code Retreat</a>, where each participant is forbidden to use some specific constructions they are used to in everyday work, like „for -> inside for -> inside for ” etc., which leads them to better Object Oriented Design. OOD – because most people choose Java as their primary language.
</p>
<p>
There is Code Retreat and there is professional life. In life a given language brings limitations like:
<ul>
<li><b>Java</b> - for the last 20 years Java did a Code Retreat with the limitation „everything is an object” - which for 5% of the programmers I know means good practices of OOP design – but for the rest – dozens of fors enclosed in a „SomethingManager”. </li>
<li><b>Haskell</b> – you are limited to good (or bad) Functional Programming design. </li>
</ul>
</p>
<p>
Maybe the first language wins in one specific environment with specific problems and the other can solve a different class of problems? <b>What if you have both classes of problems in your project?</b>
</p>
<p>
To support this thesis let's look at a very interesting quote from a very nice book: <a href="https://www.amazon.com/Concepts-Techniques-Models-Computer-Programming/dp/0262220695/">Concepts, Techniques, and Models of Computer Programming</a>
</p>
<div class="dygresja">
Adding a concept to a computation model introduces new forms of expression, making some programs simpler, but it also makes reasoning about programs
harder. For example, by adding explicit state (mutable variables) to a functional
programming model we can express the full range of object-oriented programming
techniques. However, reasoning about object-oriented programs is harder than reasoning about functional programs.
</div>
<p>
Also <i>Effective Java</i> – a book more than 10 years old now – promotes immutable structures as easier to operate on and understand. (Still, whenever performance is at stake you will most likely end up with mutable structures.)
</p>
<h1>Everything is an object?</h1>
<p class="akapit">
Let's start by looking at one interesting example of a limitation which is only a limitation in thinking. According to my current knowledge, OOP focuses on solving problems by managing changes of mutable state. State is encapsulated in an object, which is a noun. The noun is from time to time called Invoice, Customer or Money, which makes it easier to convince business people to this approach.
</p>
<p>
However, OOP has no monopoly on nouns. In FP we can easily create new types.
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">data</span> <span style="color:#c5656b;font-weight:700">Customer</span><span style="color:#43a8ed;font-weight:700">=</span> <span style="color:#c5656b;font-weight:700">Customer</span> {firstName<span style="color:#43a8ed;font-weight:700">::</span><span style="color:#c5656b;font-weight:700">String</span>, lastName<span style="color:#43a8ed;font-weight:700">::</span><span style="color:#c5656b;font-weight:700">String</span>} <span style="color:#43a8ed;font-weight:700">deriving</span> (<span style="font-style:italic">Show</span>)
<span style="color:#43a8ed;font-weight:700">let</span> c<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#c5656b;font-weight:700">Customer</span> {firstName<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Stefan"</span>, lastName<span style="color:#43a8ed;font-weight:700">=</span> <span style="color:#049b0a">"Zakupowy"</span>}
<span style="color:#c5656b;font-weight:700">Prelude</span> <span style="color:#43a8ed;font-weight:700">></span> c
<span style="color:#c5656b;font-weight:700">Customer</span> {firstName <span style="color:#43a8ed;font-weight:700">=</span> <span style="color:#049b0a">"Stefan"</span>, lastName <span style="color:#43a8ed;font-weight:700">=</span> <span style="color:#049b0a">"Zakupowy"</span>}
</pre>
<h2>Polymorphism</h2>
<p class="akapit">
There was a time when the word "polymorphism" triggered a <a href="https://en.wikipedia.org/wiki/Classical_conditioning">"Pavlovian reaction"</a> in my brain: "polymorphism &lt;-&gt; inheritance". How surprised I was when I discovered there is a lot more to it!
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyL7fE9gVuZ0yadcGX8Jl4X6LsjfuS7tsNYEejTUL7jMt7Pj5NhR9h-Xmjnb9pNw56DLrHo_eA9vZX1FuaTycjTF0lCp08tVh8-WJRx7bO8-lqd4N9-e_0THEdG6k52EMxLSFOwqJCwmyr/s1600/polimorfizm.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiyL7fE9gVuZ0yadcGX8Jl4X6LsjfuS7tsNYEejTUL7jMt7Pj5NhR9h-Xmjnb9pNw56DLrHo_eA9vZX1FuaTycjTF0lCp08tVh8-WJRx7bO8-lqd4N9-e_0THEdG6k52EMxLSFOwqJCwmyr/s400/polimorfizm.jpg" width="400" height="171" data-original-width="241" data-original-height="103" /></a></div>
<p>
Take, for example, <i>"ad hoc polymorphism"</i> - it is something very exotic in the Java world. Let's assume we have some tomatoes, very clean and very pure domain tomatoes.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">case</span> <span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">Tomato</span><span style="color: #333333">(</span>weight<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Int</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> tomatoes<span style="color: #008800; font-weight: bold">=</span><span style="color: #BB0066; font-weight: bold">Seq</span><span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Tomato</span><span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">),</span><span style="color: #BB0066; font-weight: bold">Tomato</span><span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">4</span><span style="color: #333333">),</span><span style="color: #BB0066; font-weight: bold">Tomato</span><span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">7</span><span style="color: #333333">))</span>
</pre></div>
<p>
How can we quickly calculate the sum of all tomatoes? You can add a <i>Numeric</i> nature to the tomato <b>ad hoc</b> and then just use tomatoes as numbers:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">implicit</span> <span style="color: #008800; font-weight: bold">val</span> numericNatureOfTomato<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">Numeric</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">]{</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> plus<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">,</span> y<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span> <span style="color: #333333">=</span> <span style="color: #BB0066; font-weight: bold">Tomato</span><span style="color: #333333">(</span>x<span style="color: #333333">.</span>weight<span style="color: #333333">+</span>y<span style="color: #333333">.</span>weight<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> fromInt<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Int</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span> <span style="color: #333333">=</span> <span style="color: #BB0066; font-weight: bold">Tomato</span><span style="color: #333333">(</span>x<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> toInt<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Int</span> <span style="color: #333333">=</span> x<span style="color: #333333">.</span>weight
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> toDouble<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Double</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> toFloat<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Float</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> negate<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> toLong<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Long</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> times<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">,</span> y<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> minus<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">,</span> y<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> compare<span style="color: #333333">(</span>x<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">,</span> y<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Tomato</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Int</span> <span style="color: #333333">=</span> <span style="color: #333333">???</span>
<span style="color: #333333">}</span>
tomatoes<span style="color: #333333">.</span>sum
<span style="color: #888888">//res0: Tomato = Tomato(12)</span>
<span style="color: #888888">//def sum[B >: A](implicit num: Numeric[B]): B</span>
</pre></div>
<p>
In Java you would have to implement a specific interface and add it to Tomato upfront. Then, by extending/implementing, you would add the Numeric nature to your class. The Scala implementation above may look like the <i>Strategy</i> pattern at first - and here I can suggest learning Haskell just for educational purposes, to better understand those mechanisms.
</p>
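<p>
To see why the Scala version resembles the Strategy pattern, here is a rough Java sketch in which the "nature" is an ordinary object passed by hand. The names <i>Addable</i>, <i>TomatoSum</i> and <i>demoTotalWeight</i> are invented for this illustration only; the difference from Scala is that nobody finds the instance for us, so we have to pass it explicitly.
</p>

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical domain type mirroring the Scala case class above.
final class Tomato {
    final int weight;
    Tomato(int weight) { this.weight = weight; }
}

// Hypothetical "nature" interface; it lives outside Tomato, so Tomato is never modified.
interface Addable<T> {
    T zero();
    T plus(T x, T y);
}

final class TomatoSum {
    // The instance is defined next to the use site, not inside Tomato.
    static final Addable<Tomato> TOMATO_ADDABLE = new Addable<Tomato>() {
        @Override public Tomato zero() { return new Tomato(0); }
        @Override public Tomato plus(Tomato x, Tomato y) { return new Tomato(x.weight + y.weight); }
    };

    // Java has no implicits, so the instance is an explicit parameter.
    static <T> T sum(List<T> items, Addable<T> addable) {
        T acc = addable.zero();
        for (T item : items) {
            acc = addable.plus(acc, item);
        }
        return acc;
    }

    // Small usage example with the same tomatoes as in the Scala snippet.
    static int demoTotalWeight() {
        List<Tomato> tomatoes = Arrays.asList(new Tomato(1), new Tomato(4), new Tomato(7));
        return sum(tomatoes, TOMATO_ADDABLE).weight;
    }
}
```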
<p>
And since we have mentioned implementing/extending and inheritance in general - technically it is just a realization of a more general mechanism called <a href="https://en.wikipedia.org/wiki/Dynamic_dispatch">Single Dispatch</a>, and once you understand the general context it may be easier to spot the advantages and disadvantages of this mechanism - in contrast to the popular example from OOP books - <i>"Cat and Dog extends Animal"</i>.
</p>
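<p>
A minimal Java sketch of what "single" means here: only the runtime type of the receiver takes part in choosing the method, while overloads are chosen from the static type of the argument. The classes below are hypothetical and borrow the textbook naming.
</p>

```java
// Hypothetical classes, named after the textbook example mentioned above.
class Animal {
    String sound() { return "..."; }
}

class Dog extends Animal {
    @Override String sound() { return "woof"; }
}

final class DispatchDemo {
    // Overloads are selected at compile time from the static type of the argument.
    static String describe(Animal a) { return "some animal"; }
    static String describe(Dog d) { return "a dog"; }

    static String receiverDispatch() {
        Animal a = new Dog();
        return a.sound();   // the runtime type of the receiver decides: Dog.sound()
    }

    static String argumentDispatch() {
        Animal a = new Dog();
        return describe(a); // the static type of the argument decides: describe(Animal)
    }
}
```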
<h1>Information Hiding</h1>
<p class="akapit">
Usually when we think OOP we think <i>Encapsulation</i> (Pavlov again?) - and to be clear, it is a very good practice - but let's stop and think for a moment (again) to understand where it came from, and whether it is characteristic only of OOP or is something more general that we can have without OOP.
</p>
<p>
And again, <i>encapsulation</i> is an example of a more general concept: <i>information hiding</i>, which leads to better modularisation of source code. Haskell is far from OOP, but it has modules and is able to hide information in those modules. Then you need a mechanism other than an "object method" for accessing that private data. One such mechanism, usually considered an FP mechanism, is <i>Pattern Matching</i>.
</p>
<p>
First of all, be aware that <i>Data</i> does not always have to hide something - an example is Scala's <i>case class</i>, which only represents a record. Still, Scala, because of its FP+OOP nature, allows you to connect Pattern Matching with object encapsulation very nicely.
</p>
<p>
So, given a module for playing a card game:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">trait</span> <span style="color: #BB0066; font-weight: bold">Game</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Card</span><span style="color: #333333">]{</span>
  <span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">Deck</span> <span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">private</span> <span style="color: #008800; font-weight: bold">val</span> cards<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">List</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Card</span><span style="color: #333333">]){</span>
<span style="color: #008800; font-weight: bold">def</span> takeCard <span style="color: #008800; font-weight: bold">:</span> <span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">Card</span><span style="color: #333333">,</span><span style="color: #333399; font-weight: bold">Deck</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">(</span>cards<span style="color: #333333">.</span>head<span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Deck</span><span style="color: #333333">(</span>cards<span style="color: #333333">.</span>tail<span style="color: #008800; font-weight: bold">:_</span><span style="color: #333399; font-weight: bold">*</span><span style="color: #333333">))</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">object</span> <span style="color: #BB0066; font-weight: bold">Deck</span><span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">def</span> apply<span style="color: #333333">(</span>cards<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Card*</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span> <span style="color: #008800; font-weight: bold">new</span> <span style="color: #BB0066; font-weight: bold">Deck</span><span style="color: #333333">(</span>cards<span style="color: #333333">.</span>toList<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">def</span> unapply<span style="color: #333333">(</span>deck<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Deck</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Option</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Card</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #BB0066; font-weight: bold">Some</span><span style="color: #333333">(</span>deck<span style="color: #333333">.</span>cards<span style="color: #333333">.</span>head<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
The structure used to store cards (<b>List</b>) is invisible from the outside. The next step is to implement a concrete game with a simple String representation of cards.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">object</span> <span style="color: #BB0066; font-weight: bold">StringsGame</span> <span style="color: #008800; font-weight: bold">extends</span> <span style="color: #BB0066; font-weight: bold">Game</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">]{</span>
  <span style="color: #008800; font-weight: bold">def</span> play<span style="color: #333333">(</span>d<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Deck</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span> d <span style="color: #008800; font-weight: bold">match</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">case</span> <span style="color: #BB0066; font-weight: bold">Deck</span><span style="color: #333333">(</span><span style="background-color: #fff0f0">"ace"</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=></span> <span style="background-color: #fff0f0">"I have ace"</span>
<span style="color: #008800; font-weight: bold">case</span> <span style="color: #008800; font-weight: bold">_</span> <span style="color: #008800; font-weight: bold">=></span> <span style="background-color: #fff0f0">"something else :("</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
And now we can play
</p>
<pre style="background:#2a211c;color:#bdae9d">import StringsGame._
val deck1=Deck(<span style="color:#049b0a">"ace"</span>,<span style="color:#049b0a">"king"</span>,<span style="color:#049b0a">"ten"</span>)
val (_,deck2)=deck1.takeCard
play(deck1) //res2: String = <span style="color:#049b0a">"I have ace"</span>
play(deck2) //res3: String = <span style="color:#049b0a">"something else :("</span>
</pre>
<h1>Abstractions</h1>
<p>
Talking about abstractions can be very abstract itself (here is a very good presentation about this topic: <a href="https://www.youtube.com/watch?v=GqmsQeSzMdw">Constraints Liberate, Liberties Constrain - Runar Bjarnason</a>). To avoid "flying" too far from the topic, let's focus on a good practice frequently used in Java: programming against an abstract interface to postpone revealing details and preserve freedom of choice. For example, <i>Collection</i> or <i>List</i> types are very often used in declarations so as not to reveal that we have a <i>LinkedList</i>.
</p>
<p>
Yet sometimes abstractions are not that obvious, and to spot them you need to join known facts in a new way. Maybe there is something similar between <i>Optional</i> and <i>CompletableFuture</i>, even though at first sight those mechanisms seem to be completely separate and designed for completely different things? The hardest thing may be to change someone's "stabilized" opinions and views on what is and what is not good practice in a given context, so that they can learn new approaches to a problem.
</p>
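<p>
One way to make that similarity visible is to write the same two-step pipeline against both types. This is only a sketch; the class and method names (<i>SameShapeDemo</i>, <i>viaOptional</i>, <i>viaFuture</i>) are made up for the example, while the library methods used are the standard Java 8 ones.
</p>

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

final class SameShapeDemo {
    // Optional: a value that may be absent.
    static Optional<Integer> viaOptional(Optional<Integer> input) {
        return input
                .map(x -> x + 1)                    // transform the value inside
                .flatMap(x -> Optional.of(x * 2));  // chain a step that itself returns an Optional
    }

    // CompletableFuture: a value that may arrive later.
    static CompletableFuture<Integer> viaFuture(CompletableFuture<Integer> input) {
        return input
                .thenApply(x -> x + 1)                                        // plays the role of map
                .thenCompose(x -> CompletableFuture.completedFuture(x * 2));  // plays the role of flatMap
    }
}
```

<p>
The method names differ, but the shape is the same: one operation transforms the value inside the container and another chains a step that returns a new container.
</p>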
<h1> When Good is Bad? </h1>
<p class="cytat">
So far we have mainly discussed the good sides of using multiple paradigms, but are there any downsides? Sometimes borrowing a concept from one paradigm and using it in a different context creates a disaster, and then people tend to blame only <b>the concept</b>, forgetting about the context.
</p>
<p>
For example, a <i>Utils class</i> in an OOP context very often signals a design smell, because where we expect separate encapsulated states we come across a "bag of loosely connected methods".
</p>
<p>
On the other hand there is <i>org.apache.commons.lang.StringUtils</i> which is... indeed... very convenient to use even in a "hardcore" OOP context. What went right there? In OOP there is a metric called <b>LCOM4</b> which measures object cohesion by checking whether there are independent states within an object's state. Yet we cannot use it for StringUtils - StringUtils doesn't have any state. StringUtils also isn't coupled with any external state, and this is crucial. Custom "Utils" are very often <i>procedures</i> which operate on external state - StringUtils is independent. A pure function (without the <i>function</i> keyword) in the middle of the Object Oriented Paradigm.
</p>
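<p>
The distinction can be sketched in Java as follows. Both classes are hypothetical (<i>TextUtils</i> only imitates the style of StringUtils, it is not the Apache class): the first holds a pure function, the second a procedure whose whole effect is a change to external state.
</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility in the style of StringUtils: no state of its own, no external state.
final class TextUtils {
    private TextUtils() {}

    // Pure function: the result depends only on the argument.
    static String capitalize(String s) {
        if (s == null || s.isEmpty()) return s;
        return Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }
}

// Hypothetical "Utils" of the smelly kind: a procedure coupled with shared mutable state.
final class OrderUtils {
    static final List<String> PENDING = new ArrayList<>();

    // Returns nothing; its only effect is a mutation outside the method.
    static void register(String orderId) {
        PENDING.add(orderId);
    }
}
```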
<p>
When this difference is not understood correctly, the concept of a "stateless class with static methods/functions" is always treated as a code smell.
</p>
<h1>Hidden Paradigm</h1>
<p>
There is one special "paradigm" which is not well described in books about good practices - the "paradigm" of writing f***ing unreadable code. It is a complex social phenomenon in which a whole range of people from the corporate ladder participate. This paradigm is like <i>a logic gate</i> in front of other paradigms. If it is present, then we cannot talk about a correct OOP or correct FP approach because we just have a "big ball of mud", "spaghetti" or just "f***ing bad code". Until you remove this "bad paradigm", discussion about other paradigms is just science fiction.
</p>
<h1>Summary </h1>
<p class="akapit">
Imagine that you have two paradigms of movement: walking and running. There can be an endless discussion about which is better: "When you are running you have a chance to flee from an angry dog" - "hey! but if you walk silently then maybe the dog won't notice you". This example is of course quite silly, because "moving" is a very intuitive "thing" for us, and by using it every day we gain intuition about the context.
</p>
<p>
Programming is not that intuitive for us and we tend to judge different approaches too quickly. Very often we learn one approach with our first technology ecosystem and then we tend to reject any other. Maybe it is better to first learn in depth how to "walk" and "run" and only then compare those two ways of movement? This is called "being an engineer" - knowing when to use a given tool. (You can of course earn a ton of money by being a so-called <i>Technology Evangelist</i> and promoting only one technology.)
</p>
<p>
In the JVM world, for more than a decade there was only one "politically correct" paradigm, so maybe to "equalize" concepts people from other paradigms have to be more aggressive than is really needed? I hope, dear reader, that through this article I was able to show you that different paradigms can enrich each other. Or maybe there is just "one" paradigm, of which we can see only parts? There is a good quote from one smart book about programming which I think perfectly matches the thesis of this text and shows how different approaches are connected - <i>"(...)A function with internal memory is usually called an object(...)"</i>
</p>
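<p>
The quote can be illustrated with a few lines of Java: a lambda that captures a mutable cell behaves like a tiny object with one method and one private field. <i>CounterDemo</i> and <i>newCounter</i> are names chosen for this sketch only.
</p>

```java
import java.util.function.IntSupplier;

final class CounterDemo {
    // Returns a function that remembers how many times it has been called.
    static IntSupplier newCounter() {
        int[] calls = {0};        // captured mutable cell: the function's "internal memory"
        return () -> ++calls[0];  // each call updates and returns the remembered value
    }
}
```

<p>
Every call to <i>newCounter</i> creates a separate memory cell, just as every <i>new</i> creates a separate object.
</p>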
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQjD9sAG3zm90hWpiqDX7f5lxYVA7yt6ujRT5bBs4IzqslclJehIEnj_ko3kxjTUytPtpF839fTIq9rJBXosz5qu11P5UJM5GXK81Vj-Wt_yG9PFRWMgiyP5lTBaye4l7FSDCqYOM6sxOu/s1600/okladkamala.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhQjD9sAG3zm90hWpiqDX7f5lxYVA7yt6ujRT5bBs4IzqslclJehIEnj_ko3kxjTUytPtpF839fTIq9rJBXosz5qu11P5UJM5GXK81Vj-Wt_yG9PFRWMgiyP5lTBaye4l7FSDCqYOM6sxOu/s640/okladkamala.jpg" width="640" height="640" data-original-width="800" data-original-height="800" /></a></div>
<h1>Consequences of invariance</h1>
<p>Paweł Włodarski, 2017-07-02</p>
<p class="akapit">
The first step is curiosity. If you are lucky enough to use at least Java 8 and you have <i>java.util.function.Function</i> at your disposal, then maybe you have seen methods (yes, "methods on a function") like <b>compose</b> or <b>andThen</b>. If you follow the implementation you will see that the declarations have quite broad generics:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">default <span style="color: #333333"><</span>V<span style="color: #333333">></span> <span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><</span>V<span style="color: #333333">,</span> R<span style="color: #333333">></span> compose<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">super</span> V<span style="color: #333333">,</span> <span style="color: #333333">?</span> <span style="color: #008800; font-weight: bold">extends</span> T<span style="color: #333333">></span> before<span style="color: #333333">)</span>
default <span style="color: #333333"><</span>V<span style="color: #333333">></span> <span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><</span>T<span style="color: #333333">,</span> V<span style="color: #333333">></span> andThen<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">super</span> R<span style="color: #333333">,</span> <span style="color: #333333">?</span> <span style="color: #008800; font-weight: bold">extends</span> V<span style="color: #333333">></span> after<span style="color: #333333">)</span>
</pre></div>
<p>
And the point is that we will see such declarations more and more in our daily programming. In this article I will try to show that technically only such declarations with "? super" and "? extends" make practical sense, and that you will have to type those signatures each time you pass a function as a parameter. We will see where this comes from, and that there was, and still is, a possibly more convenient alternative.
</p>
<h2 class="sekcja">When types have a type</h2>
<p class="akapit">
A long, long time ago everything in Java was literally <i>an Object</i>. If you had a list then you had a list of Objects - always Objects. It was a time when CRT monitors would burn your eyes and an application took 15 minutes to start, only to throw a <i>ClassCastException</i> right after startup. Because you had to predict, or just guess, what type you were operating on. It was a time when Java did not have generics.
</p>
<p>
When generics came with Java 5 those problems became marginal. However, a new class of problems appeared, because now the assumption "everything is an Object" started generating some strange problems.
</p>
<p>
Because if <i>String</i> is an <i>Object</i> and we have a "List of Strings" - so technically a list of objects, because everything is an object - then is this list also an Object, and finally, can it also be seen as a list of objects? Unfortunately - for a Java list - the last one is not true.
</p>
<p>
If this were possible you could write code like this:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">String</span><span style="color: #333333">></span> strings<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">LinkedList</span><span style="color: #333333"><>();</span>
<span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">Object</span><span style="color: #333333">></span> objects <span style="color: #008800; font-weight: bold">=</span> strings<span style="color: #333333">;</span>
objects<span style="color: #333333">.</span>add<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">);</span> <span style="color: #888888">//disaster!!</span>
</pre></div>
</p>
<p>
This would be a disaster. But we can make a pact with the compiler that we will not put anything "bad" there and write a declaration like the following one:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">extends</span> <span style="color: #BB0066; font-weight: bold">Object</span><span style="color: #333333">></span> pactWithCompiler<span style="color: #008800; font-weight: bold">=</span>strings<span style="color: #333333">;</span>
</pre></div>
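<p>
A small sketch of what the pact means in practice, assuming the same <i>strings</i> list as above (the wrapper class <i>WildcardPactDemo</i> is invented for the example): reading elements as <i>Object</i> is allowed, while adding is rejected by the compiler.
</p>

```java
import java.util.LinkedList;
import java.util.List;

final class WildcardPactDemo {
    static Object firstElement() {
        List<String> strings = new LinkedList<>();
        strings.add("ace");

        // Allowed: through this reference we only read elements, and only as Object.
        List<? extends Object> pactWithCompiler = strings;
        Object first = pactWithCompiler.get(0);

        // Rejected at compile time, which is the whole point of the pact:
        // pactWithCompiler.add(1);

        return first;
    }
}
```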
<p>
Could Java have chosen a different approach? There is a convenient alternative which we are going to see soon. Still, looking at this piece of code, the list declaration seems to be ok and prevents ClassCastException. This was the year 2004. Ten years passed. Java 8 was born with java.util.function.Function in it. And it is in this new mechanism that we will see the flaws of the Java generics design.
</p>
<h2 class="sekcja">Use Site Variance</h2>
<p>
So again let's take a look at <i>java.util.function.Function</i> to understand the consequences of how generics are implemented in Java.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">default</span> <span style="color: #333333"><</span>V<span style="color: #333333">></span> Function<span style="color: #333333"><</span>V<span style="color: #333333">,</span> R<span style="color: #333333">></span> compose<span style="color: #333333">(</span>Function<span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">super</span> V<span style="color: #333333">,</span> <span style="color: #333333">?</span> <span style="color: #008800; font-weight: bold">extends</span> T<span style="color: #333333">></span> before<span style="color: #333333">)</span> <span style="color: #333333">{</span>
Objects<span style="color: #333333">.</span><span style="color: #0000CC">requireNonNull</span><span style="color: #333333">(</span>before<span style="color: #333333">);</span>
<span style="color: #008800; font-weight: bold">return</span> <span style="color: #333333">(</span>V v<span style="color: #333333">)</span> <span style="color: #333333">-></span> apply<span style="color: #333333">(</span>before<span style="color: #333333">.</span><span style="color: #0000CC">apply</span><span style="color: #333333">(</span>v<span style="color: #333333">));</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
Do <i>extends</i> and <i>super</i> have to be there? Now focus :) - they have to be there because ... there is no point in not putting them there! If you do not put them there you will limit the function signature without any justified reason.
</p>
<h3> extends </h3>
What is the difference between the following two functions?
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">public</span> <span style="color: #008800; font-weight: bold">static</span> <span style="color: #333333"><</span>A<span style="color: #333333">,</span>B<span style="color: #333333">></span> Collection<span style="color: #333333"><</span>B<span style="color: #333333">></span> libraryMethod<span style="color: #333333">(</span>Collection<span style="color: #333333"><</span>A<span style="color: #333333">></span> c<span style="color: #333333">,</span> Function<span style="color: #333333"><</span>A<span style="color: #333333">,</span>B<span style="color: #333333">></span> f<span style="color: #333333">){</span>
List<span style="color: #333333"><</span>B<span style="color: #333333">></span> l <span style="color: #333333">=</span><span style="color: #008800; font-weight: bold">new</span> ArrayList<span style="color: #333333"><>();</span>
<span style="color: #008800; font-weight: bold">for</span><span style="color: #333333">(</span>A <span style="color: #997700; font-weight: bold">a:</span> c<span style="color: #333333">){</span>
l<span style="color: #333333">.</span><span style="color: #0000CC">add</span><span style="color: #333333">(</span>f<span style="color: #333333">.</span><span style="color: #0000CC">apply</span><span style="color: #333333">(</span>a<span style="color: #333333">));</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">return</span> l<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">public</span> <span style="color: #008800; font-weight: bold">static</span> <span style="color: #333333"><</span>A<span style="color: #333333">,</span>B<span style="color: #333333">></span> Collection<span style="color: #333333"><</span>B<span style="color: #333333">></span> libraryMethod2<span style="color: #333333">(</span>Collection<span style="color: #333333"><</span>A<span style="color: #333333">></span> c<span style="color: #333333">,</span> Function<span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">super</span> A<span style="color: #333333">,?</span> <span style="color: #008800; font-weight: bold">extends</span> B<span style="color: #333333">></span> f<span style="color: #333333">){</span>
List<span style="color: #333333"><</span>B<span style="color: #333333">></span> l <span style="color: #333333">=</span><span style="color: #008800; font-weight: bold">new</span> ArrayList<span style="color: #333333"><>();</span>
<span style="color: #008800; font-weight: bold">for</span><span style="color: #333333">(</span>A <span style="color: #997700; font-weight: bold">a:</span> c<span style="color: #333333">){</span>
l<span style="color: #333333">.</span><span style="color: #0000CC">add</span><span style="color: #333333">(</span>f<span style="color: #333333">.</span><span style="color: #0000CC">apply</span><span style="color: #333333">(</span>a<span style="color: #333333">));</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">return</span> l<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
The difference shows up at invocation. Suppose we have the following class:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">User</span><span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">private</span> String name<span style="color: #333333">;</span>
<span style="color: #008800; font-weight: bold">private</span> Integer age<span style="color: #333333">;</span>
<span style="color: #008800; font-weight: bold">public</span> <span style="color: #0066BB; font-weight: bold">User</span><span style="color: #333333">(</span>String name<span style="color: #333333">,</span> Integer age<span style="color: #333333">)</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">this</span><span style="color: #333333">.</span><span style="color: #0000CC">name</span> <span style="color: #333333">=</span> name<span style="color: #333333">;</span>
<span style="color: #008800; font-weight: bold">this</span><span style="color: #333333">.</span><span style="color: #0000CC">age</span> <span style="color: #333333">=</span> age<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">public</span> String <span style="color: #0066BB; font-weight: bold">getName</span><span style="color: #333333">()</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">return</span> name<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">public</span> Integer <span style="color: #0066BB; font-weight: bold">getAge</span><span style="color: #333333">()</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">return</span> age<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
We would like to execute the following computation:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">User</span><span style="color: #333333">></span> users<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">LinkedList</span><span style="color: #333333"><>();</span>
<span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">User</span><span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">String</span><span style="color: #333333">></span> display<span style="color: #008800; font-weight: bold">=</span>o<span style="color: #333333">-></span>o<span style="color: #333333">.</span>toString<span style="color: #333333">();</span>
<span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">Object</span><span style="color: #333333">></span> res1<span style="color: #008800; font-weight: bold">=</span>libraryMethod<span style="color: #333333">(</span>users<span style="color: #333333">,</span>display<span style="color: #333333">);</span> <span style="color: #888888">// error</span>
<span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">Object</span><span style="color: #333333">></span> res2<span style="color: #008800; font-weight: bold">=</span>libraryMethod2<span style="color: #333333">(</span>users<span style="color: #333333">,</span>display<span style="color: #333333">);</span>
</pre></div>
<p>
Unfortunately we cannot do it :( The first call does not compile: <i>libraryMethod</i> infers B as String and returns Collection&lt;String&gt;, which is not a Collection&lt;Object&gt;. Without "super" and "extends" we put an artificial limitation on our function, so it can no longer return a collection of a supertype of String. Working around it takes extra effort on the caller's side, and there is no justification for this limitation. You will meet this "pattern" whenever you want to respect subtype polymorphism.
</p>
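<p>
A minimal, self-contained sketch of the same effect follows. It assumes nothing beyond the JDK: the class name <i>ExtendsDemo</i> and the String/Integer/Number types are stand-ins chosen for illustration, not part of the original example. The function produces Integer, yet the caller may ask for a Collection of Number because of "? extends B".
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical demo class; only libraryMethod2 mirrors the article's code.
class ExtendsDemo {

    // Same shape as libraryMethod2: wildcards on both function type arguments.
    static <A, B> Collection<B> libraryMethod2(Collection<A> c,
                                               Function<? super A, ? extends B> f) {
        List<B> l = new ArrayList<>();
        for (A a : c) {
            l.add(f.apply(a));
        }
        return l;
    }

    public static void main(String[] args) {
        Collection<String> names = Arrays.asList("Ann", "Bob");
        Function<String, Integer> length = String::length;

        // B is inferred as Number from the assignment target.
        // Function<String, Integer> fits Function<? super String, ? extends Number>.
        Collection<Number> lengths = libraryMethod2(names, length);
        System.out.println(lengths);
    }
}
```

<p>
With a plain Function&lt;A,B&gt; parameter the same assignment would be rejected by the compiler, exactly like <i>res1</i> above.
</p>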
<h3> super </h3>
<p>
I hope the need for "extends" is now clear. The situation with "super" is less intuitive. Let's try the following: "a User has a subclass which is its specialization".
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">SpecialUser</span> <span style="color: #008800; font-weight: bold">extends</span> <span style="color: #BB0066; font-weight: bold">User</span><span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">private</span> <span style="color: #BB0066; font-weight: bold">String</span> somethingSpecial<span style="color: #333333">;</span>
public <span style="color: #BB0066; font-weight: bold">SpecialUser</span><span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">String</span> name<span style="color: #333333">,</span> <span style="color: #BB0066; font-weight: bold">Integer</span> age<span style="color: #333333">,</span> <span style="color: #BB0066; font-weight: bold">String</span> somethingSpecial<span style="color: #333333">)</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">super</span><span style="color: #333333">(</span>name<span style="color: #333333">,</span> age<span style="color: #333333">);</span>
<span style="color: #008800; font-weight: bold">this</span><span style="color: #333333">.</span>somethingSpecial <span style="color: #008800; font-weight: bold">=</span> somethingSpecial<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
And now we have another pair of library functions, this time for filtering our objects:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">public static <span style="color: #333333"><</span>A<span style="color: #333333">></span> <span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span>A<span style="color: #333333">></span> filter1<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span>A<span style="color: #333333">></span> c<span style="color: #333333">,</span> <span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><</span>A<span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Boolean</span><span style="color: #333333">></span> f<span style="color: #333333">){</span>
<span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333"><</span>A<span style="color: #333333">></span> l <span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">ArrayList</span><span style="color: #333333"><>();</span>
<span style="color: #008800; font-weight: bold">for</span><span style="color: #333333">(</span>A a<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">c</span><span style="color: #333333">){</span>
<span style="color: #008800; font-weight: bold">if</span><span style="color: #333333">(</span>f<span style="color: #333333">.</span>apply<span style="color: #333333">(</span>a<span style="color: #333333">))</span> l<span style="color: #333333">.</span>add<span style="color: #333333">(</span>a<span style="color: #333333">);</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">return</span> l<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
public static <span style="color: #333333"><</span>A<span style="color: #333333">></span> <span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span>A<span style="color: #333333">></span> filter2<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span>A<span style="color: #333333">></span> c<span style="color: #333333">,</span> <span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">super</span> A<span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Boolean</span><span style="color: #333333">></span> f<span style="color: #333333">){</span>
<span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333"><</span>A<span style="color: #333333">></span> l <span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">ArrayList</span><span style="color: #333333"><>();</span>
<span style="color: #008800; font-weight: bold">for</span><span style="color: #333333">(</span>A a<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">c</span><span style="color: #333333">){</span>
<span style="color: #008800; font-weight: bold">if</span><span style="color: #333333">(</span>f<span style="color: #333333">.</span>apply<span style="color: #333333">(</span>a<span style="color: #333333">))</span> l<span style="color: #333333">.</span>add<span style="color: #333333">(</span>a<span style="color: #333333">);</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">return</span> l<span style="color: #333333">;</span>
<span style="color: #333333">}</span>
</pre></div>
<p>
And again there is no reason to prohibit functions which work on supertypes, because after all a SpecialUser IS-A User, so a function that accepts any User can also be applied to a SpecialUser.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">User</span><span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Boolean</span><span style="color: #333333">></span> isAdult<span style="color: #008800; font-weight: bold">=</span>user<span style="color: #333333">-></span>user<span style="color: #333333">.</span>getAge<span style="color: #333333">()>=</span> <span style="color: #0000DD; font-weight: bold">18</span><span style="color: #333333">;</span>
<span style="color: #BB0066; font-weight: bold">Collection</span><span style="color: #333333"><</span><span style="color: #BB0066; font-weight: bold">SpecialUser</span><span style="color: #333333">></span> specialUsers<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">LinkedList</span><span style="color: #333333"><>();</span>
<span style="color: #888888">// filter1(specialUsers,isAdult); //error because of no "super"</span>
filter2<span style="color: #333333">(</span>specialUsers<span style="color: #333333">,</span>isAdult<span style="color: #333333">);</span>
</pre></div>
<p>
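<p>
Here is a runnable sketch of the "super" case, under a few assumptions: <i>SuperDemo</i> is a hypothetical wrapper class, and its nested User and SpecialUser are trimmed-down stand-ins for the classes above (public final fields instead of getters). Only <i>filter2</i> follows the article's signature.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;

// Hypothetical demo class wrapping simplified versions of the article's types.
class SuperDemo {

    static class User {
        final String name;
        final int age;
        User(String name, int age) { this.name = name; this.age = age; }
    }

    static class SpecialUser extends User {
        SpecialUser(String name, int age) { super(name, age); }
    }

    // "? super A" accepts a function declared on A or on any supertype of A.
    static <A> Collection<A> filter2(Collection<A> c, Function<? super A, Boolean> f) {
        List<A> l = new ArrayList<>();
        for (A a : c) {
            if (f.apply(a)) l.add(a);
        }
        return l;
    }

    public static void main(String[] args) {
        Function<User, Boolean> isAdult = user -> user.age >= 18;
        Collection<SpecialUser> specialUsers =
                Arrays.asList(new SpecialUser("Ann", 30), new SpecialUser("Tim", 12));

        // A is SpecialUser; the predicate over the supertype User is accepted,
        // and the result keeps the precise element type.
        Collection<SpecialUser> adults = filter2(specialUsers, isAdult);
        for (SpecialUser u : adults) {
            System.out.println(u.name);
        }
    }
}
```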
Now if you check how, for example, <b>map</b> is declared on <i>java.util.stream.Stream</i> you will see the same pattern again:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #333333"><</span>R<span style="color: #333333">></span> <span style="color: #BB0066; font-weight: bold">Stream</span><span style="color: #333333"><</span>R<span style="color: #333333">></span> map<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Function</span><span style="color: #333333"><?</span> <span style="color: #008800; font-weight: bold">super</span> T<span style="color: #333333">,</span> <span style="color: #333333">?</span> <span style="color: #008800; font-weight: bold">extends</span> R<span style="color: #333333">></span> mapper<span style="color: #333333">);</span>
</pre></div>
<p>
because other declarations really do not make much sense. But in that case, would it be possible to implement generics in a different way, so that programmers could type less and, more importantly, introduce fewer bugs (what if you forget about "extends")? Yes, it is possible and it actually works quite well. The Java approach is called <b>use-site variance</b> and the alternative is...
</p>
<h2 class="sekcja"> Declaration Site variance </h2>
<p>
In other languages, instead of writing "extends" in 1000 places where a type is used, we can write it once - <b>on declaration</b>. This way we set the "nature" of a given construct once and for all. Let's see how it is implemented:
</p>
<h3> C# </h3>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">interface</span> IProducer<<span style="color: #008800; font-weight: bold">out</span> T> <span style="color: #888888">// Covariant - "extends"</span>
{
T <span style="color: #0066BB; font-weight: bold">produce</span>();
}
<span style="color: #008800; font-weight: bold">interface</span> IConsumer<<span style="color: #008800; font-weight: bold">in</span> T> <span style="color: #888888">// Contravariant - "super"</span>
{
<span style="color: #008800; font-weight: bold">void</span> <span style="color: #0066BB; font-weight: bold">consume</span>(T t);
}
</pre></div>
<h3>Kotlin </h3>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">abstract</span> <span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">Source</span><<span style="color: #008800; font-weight: bold">out</span> T> {
<span style="color: #008800; font-weight: bold">abstract</span> <span style="color: #008800; font-weight: bold">fun</span> <span style="color: #0066BB; font-weight: bold">nextT</span>(): T
}
</pre></div>
<h3> Scala </h3>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">trait</span> <span style="color: #BB0066; font-weight: bold">Function1</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">-T1</span>,<span style="color: #333399; font-weight: bold">+R</span><span style="color: #333333">]</span> <span style="color: #BB0066; font-weight: bold">extends</span> <span style="color: #BB0066; font-weight: bold">AnyRef</span>
<span style="color: #888888">//now it is default behaviour of every function that it works as </span>
<span style="color: #888888">//'<super,extends>' with input and output types</span>
</pre></div>
<h2 class="sekcja"> But why?</h2>
<p>
Why does Java have use-site variance? I don't know, and I was unable to find the answer on Google. Most likely this mechanism made a lot of sense in 2004, when it was created for mutable collections, IE had a 90% market share, people used tons of XML to exchange messages and no one thought about functions. In Scala mutable collections like Array are invariant, and theoretically in this one place Java gives more freedom, because you can change the nature of a construct at the place where it is used. But this can cause more problems than benefits, because now <b>library users - not designers - are responsible for proper declaration</b>. And since it was implemented this way in 2004, it was also used this way in 2014 for functions - maybe this is an example of technical debt.
</p>
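<p>
To make the "more freedom" point concrete, here is a small sketch of use-site variance on a mutable collection. The class <i>UseSiteDemo</i> and the method <i>copy</i> are hypothetical names used only for illustration. The same invariant List type is treated as a producer in one parameter and as a consumer in the other, purely through the wildcard written at the use site.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not taken from any library.
class UseSiteDemo {

    // src is only read from ("? extends T"), dst is only written to ("? super T").
    static <T> void copy(List<? extends T> src, List<? super T> dst) {
        for (T t : src) {
            dst.add(t);
        }
    }

    public static void main(String[] args) {
        List<Integer> ints = Arrays.asList(1, 2, 3);
        List<Number> numbers = new ArrayList<>();
        List<Object> objects = new ArrayList<>();

        // Both calls compile although List<Integer>, List<Number> and
        // List<Object> are unrelated types without the wildcards.
        copy(ints, numbers);
        copy(ints, objects);

        System.out.println(numbers);
        System.out.println(objects);
    }
}
```

<p>
The price is visible in the signature: every method that wants this flexibility has to spell out the wildcards again.
</p>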
<p>
You can read about the few advantages and many flaws of "Use-Site Variance" here -> <a href="https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Comparing_declaration-site_and_use-site_annotations.">https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)#Comparing_declaration-site_and_use-site_annotations. </a> . In general I hope this article shows clearly that declaration-site variance is a much better choice for Functions.
</p>
<h3>Links </h3>
<ol>
<li> <a href="https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)">https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)</a> </li>
<li> <a href="https://kotlinlang.org/docs/reference/generics.html#declaration-site-variance">https://kotlinlang.org/docs/reference/generics.html#declaration-site-variance</a> </li>
<li> <a href="https://schneide.wordpress.com/2015/05/11/declaration-site-and-use-site-variance-explained/">https://schneide.wordpress.com/2015/05/11/declaration-site-and-use-site-variance-explained/</a> </li>
<li> <a href="https://medium.com/byte-code/variance-in-java-and-scala-63af925d21dc#.lleehih3p">https://medium.com/byte-code/variance-in-java-and-scala-63af925d21dc#.lleehih3p</a> </li>
</ol>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvymUmDNWnMTUoIrRQVzFfBwUngpIbRLhhIoc1kkyTqo1-OgDqSgoIL91cW_99B7EsomgW63GsTJrM6ek7tU_XJmA-NI6yLN3GAOjQeRaqzoDb4z8pk3WGotgbxQWqwVOgHY_QNnJBNHaG/s1600/blogmale.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvymUmDNWnMTUoIrRQVzFfBwUngpIbRLhhIoc1kkyTqo1-OgDqSgoIL91cW_99B7EsomgW63GsTJrM6ek7tU_XJmA-NI6yLN3GAOjQeRaqzoDb4z8pk3WGotgbxQWqwVOgHY_QNnJBNHaG/s640/blogmale.jpg" width="640" height="360" data-original-width="800" data-original-height="450" /></a></div>Paweł Włodarskihttp://www.blogger.com/profile/04891037231290616803noreply@blogger.com0tag:blogger.com,1999:blog-1906126944535337964.post-34137619656130089182015-12-15T00:41:00.000+01:002015-12-15T09:58:34.424+01:00Build intuition around quality metrics<p class="akapit">
When you hear that a car drives at a speed of 250km/h or that a tree is 2 meters tall, you can intuitively classify it as fast/slow or short/tall.
</p>
<p class="akapit">
When you hear that the Cyclomatic Complexity of your code is 2.7 or that LCOM4 is 1.6, can you do the same? For many developers the answer is unfortunately - <b>no</b> - because in general there is no such real-life intuition about programming. Many times I have seen people surprised to find four nested ifs in their code. It looks as if they appeared from nowhere.
</p>
<p class="akapit">
We are not born with internal knowledge about the code around us. Here is the first article returned by Google which claims that we are actually born with some intuition about physics: <a href="http://www.sciencedaily.com/releases/2012/01/120124113051.htm">Babies are born with 'intuitive physics' knowledge, says researcher</a>. Some knowledge about the world around us was quite useful for many generations - on the other hand, computers are a relatively new thing in the history of the human species.
</p>
<p class="akapit">
So the idea of this exercise is to observe how complex code is created step by step, and also to learn how some approaches help keep the value of this metric small.
The code is used only for demonstration, so some bugs may appear in it.
</p>
<h1>Cyclomatic Complexity A.K.A Hadouken code </h1>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4UC52i76p5rty7zvEdMznpoYVqEC5Ddc0l80Uh-et0XBn_FdGW5gXpXwp-hyhOezyVsleJzFQJAa11iWq8acyWqG-xpQdYypcxaITN3v3g6t2P9fP7stLRZrnarK6sGK77I_yj3ZyPtey/s1600/cyclomatic.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4UC52i76p5rty7zvEdMznpoYVqEC5Ddc0l80Uh-et0XBn_FdGW5gXpXwp-hyhOezyVsleJzFQJAa11iWq8acyWqG-xpQdYypcxaITN3v3g6t2P9fP7stLRZrnarK6sGK77I_yj3ZyPtey/s400/cyclomatic.jpg" /></a></div>
<p class="akapit">
Wikipedia says that <b>Cyclomatic Complexity</b> <i>"is a quantitative measure of the number of linearly independent paths through a program's source code"</i>, but for practitioners it may be more intuitive that when CC=2 you need to write 2 tests, when CC=3 you need 3 tests, etc.
</p>
<p class="akapit">
Sounds trivial, but now we are going to learn that Cyclomatic Complexity has a non-linear nature, which means that if you couple two pieces of code with CC=2 you may obtain code with CC>4. Let's see this on the battlefield.
</p>
<p class="akapit">
We are going to start with Java, because it has maybe the best tools to measure complexity, and then we will see what we can measure in Scala code.
</p>
<h1>Battlefield </h1>
<p class="akapit">
Our experimental domain is very simple. Notice that at this stage we build just types with values. I used to call those DTOs in my old Java days. Now I'd call them <i>Records</i>. Later we will see how the nature of those classes changes during OOP refactoring.
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">class</span> <span style="color:#bf4f24">User</span>{
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">final</span> <span style="color:#a71d5d;font-style:italic">String</span> name;
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">final</span> <span style="color:#a71d5d;font-style:italic">Integer</span> age;
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">final</span> <span style="color:#a71d5d;font-style:italic">Gender</span> gender;
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">final</span> <span style="color:#a71d5d;font-style:italic">List<<span style="color:#a71d5d;font-style:italic">Product</span>></span> products;
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#bf4f24">User</span>(<span style="color:#a71d5d;font-style:italic">String</span> <span style="color:#234a97">name</span>, <span style="color:#a71d5d;font-style:italic">Integer</span> <span style="color:#234a97">age</span>, <span style="color:#a71d5d;font-style:italic">Gender</span> <span style="color:#234a97">gender</span>, <span style="color:#a71d5d;font-style:italic">List<<span style="color:#a71d5d;font-style:italic">Product</span>></span> <span style="color:#234a97">products</span>) {
<span style="color:#234a97">this</span><span style="color:#794938">.</span>name <span style="color:#794938">=</span> name;
<span style="color:#234a97">this</span><span style="color:#794938">.</span>age <span style="color:#794938">=</span> age;
<span style="color:#234a97">this</span><span style="color:#794938">.</span>gender <span style="color:#794938">=</span> gender;
<span style="color:#234a97">this</span><span style="color:#794938">.</span>products <span style="color:#794938">=</span> products;
}
}
<span style="color:#a71d5d;font-style:italic">enum</span> <span style="color:#bf4f24">Gender</span>{<span style="color:#811f24;font-weight:700">MALE</span>,<span style="color:#811f24;font-weight:700">FEMALE</span>}
<span style="color:#a71d5d;font-style:italic">class</span> <span style="color:#bf4f24">Product</span>{
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">final</span> <span style="color:#a71d5d;font-style:italic">String</span> name;
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">final</span> <span style="color:#a71d5d;font-style:italic">Category</span> category;
<span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#bf4f24">Product</span>(<span style="color:#a71d5d;font-style:italic">String</span> <span style="color:#234a97">name</span>, <span style="color:#a71d5d;font-style:italic">Category</span> <span style="color:#234a97">category</span>) {
<span style="color:#234a97">this</span><span style="color:#794938">.</span>name <span style="color:#794938">=</span> name;
<span style="color:#234a97">this</span><span style="color:#794938">.</span>category <span style="color:#794938">=</span> category;
}
}
<span style="color:#a71d5d;font-style:italic">enum</span> <span style="color:#bf4f24">Category</span>{<span style="color:#811f24;font-weight:700">FITNESS</span>,<span style="color:#811f24;font-weight:700">COMPUTER</span>,<span style="color:#811f24;font-weight:700">ADULT</span>}
</pre>
<h2>Laboratory</h2>
<p class="akapit">
We are going to develop and measure a piece of code which simulates the transformation of a business object into its text representation. That is more than enough for this experiment, so let's see the first version.
</p>
<pre style="background:#f9f9f9;color:#080808"> <span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">String</span> complexMethod(<span style="color:#a71d5d;font-style:italic">User</span> u){
<span style="color:#a71d5d;font-style:italic">String</span> value<span style="color:#794938">=</span><span style="color:#0b6125">""</span>;
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>age <span style="color:#794938">></span> <span style="color:#811f24;font-weight:700">18</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult : "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child : "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
<span style="color:#794938">return</span> value;
}
</pre>
<p>
We can easily measure the CC of this code, and to do this we are going to use <b>SonarQube 4.5.6</b>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEih4-jhVpChVLcPE9NXbW5V_PoOa2BR1lFc1DEU-bIiJd-cQMCb0E2NtrNuuN7wwtrCRKQCpaqd2vYkN6-_ga2Bb08FoSzY50JqBSBUcWiYdFG5mPhkSgXi0Bu1cCPoAdnIts1XetVYw9dc/s1600/cc2sonar.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEih4-jhVpChVLcPE9NXbW5V_PoOa2BR1lFc1DEU-bIiJd-cQMCb0E2NtrNuuN7wwtrCRKQCpaqd2vYkN6-_ga2Bb08FoSzY50JqBSBUcWiYdFG5mPhkSgXi0Bu1cCPoAdnIts1XetVYw9dc/s400/cc2sonar.jpg" /></a></div>
<br/><br/>The "Metrics Reloaded" plugin will also be useful in some places. <br/><br/>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtRcbLRr0CTwKmU5pH7rF_sDwAQ-7CdIX154aOsI50BxV5-M4pDbsPzLqfPHEQnXkCi2t_ZXSWwJt9IHB5C4aKVzqVFBzqNdBkDNqwfiCUc7AEtswhante_UJ9JU-lKh9IgTLKhOZMHNU3/s1600/cc2intellij.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtRcbLRr0CTwKmU5pH7rF_sDwAQ-7CdIX154aOsI50BxV5-M4pDbsPzLqfPHEQnXkCi2t_ZXSWwJt9IHB5C4aKVzqVFBzqNdBkDNqwfiCUc7AEtswhante_UJ9JU-lKh9IgTLKhOZMHNU3/s640/cc2intellij.jpg" /></a></div>
</p>
<p>
So after the first measurement we get <b>CC=2</b> - we have just <b>one if statement</b>, so we need two tests.
</p>
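<p>
As a rough illustration of "CC=2 means two tests", the sketch below exercises each branch once. It is an assumption-laden example: <i>Cc2Demo</i>, the reduced <i>User</i> (only name and age) and the helper <i>check</i> are made up for this demo, and plain exceptions are used instead of a test framework so that it runs with the JDK alone.
</p>

```java
// Hypothetical demo class; complexMethod follows the article's first version.
class Cc2Demo {

    // Reduced stand-in for the article's User: only the fields used here.
    static class User {
        final String name;
        final Integer age;
        User(String name, Integer age) { this.name = name; this.age = age; }
    }

    // One decision point (the if), hence two paths.
    static String complexMethod(User u) {
        String value;
        if (u.age > 18) {
            value = "Adult : " + u.name;
        } else {
            value = "Child : " + u.name;
        }
        return value;
    }

    // Tiny assertion helper so the example needs no test library.
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        check("Adult : Ann", complexMethod(new User("Ann", 30))); // if branch
        check("Child : Tim", complexMethod(new User("Tim", 12))); // else branch
        System.out.println("both branches covered");
    }
}
```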
<h2>CC=4</h2>
<p>
Now let's add another conditional which executes different actions depending on the Gender property.
</p>
<pre style="background:#f9f9f9;color:#080808"> <span style="color:#794938">if</span>(u<span style="color:#794938">.</span>age <span style="color:#794938">></span> <span style="color:#811f24;font-weight:700">18</span>){
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}<span style="color:#794938">else</span>{
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}
</pre>
<p>
Now let's try to guess the CC of this code. The following paths of execution are possible:
<ol>
<li>IF -> IF</li>
<li>IF -> ELSE</li>
<li>ELSE -> IF</li>
<li>ELSE -> ELSE</li>
</ol>
</p>
<p>
Sonar agrees.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXGAWE3NWkEfbc9s_ZMTK7XEDVeaM8CpH1JHNkwXPv-yoIpha_VBhYA0v-1Le9WFRZknKWVY5fAQBF_U1T8SUgKXhuW978PqZJU-3KtuFNdbxTOOC6On78i-keVD2ifTSMfqAH2xqZBS8M/s1600/cc4sonar.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjXGAWE3NWkEfbc9s_ZMTK7XEDVeaM8CpH1JHNkwXPv-yoIpha_VBhYA0v-1Le9WFRZknKWVY5fAQBF_U1T8SUgKXhuW978PqZJU-3KtuFNdbxTOOC6On78i-keVD2ifTSMfqAH2xqZBS8M/s400/cc4sonar.jpg" /></a></div>
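<p>
The four paths can be exercised with one input per path. This is only a sketch: the <b>User</b> class, the <b>Gender.FEMALE</b> constant and the helper names are assumptions, since the post does not show those types here.
</p>

```java
import java.util.Arrays;
import java.util.List;

enum Gender { MALE, FEMALE }

// Minimal record-like user; only the fields the branches look at.
class User {
    final String name;
    final int age;
    final Gender gender;

    User(String name, int age, Gender gender) {
        this.name = name;
        this.age = age;
        this.gender = gender;
    }
}

class FourPaths {
    // Same nested conditionals as in the post: 3 ifs + 1 = CC of 4.
    static String describe(User u) {
        String value;
        if (u.age > 18) {
            if (u.gender == Gender.MALE) {
                value = "Adult Male: " + u.name;
            } else {
                value = "Adult Female: " + u.name;
            }
        } else {
            if (u.gender == Gender.MALE) {
                value = "Child Male: " + u.name;
            } else {
                value = "Child Female: " + u.name;
            }
        }
        return value;
    }

    // One representative input per execution path.
    static List<User> onePerPath() {
        return Arrays.asList(
            new User("Adam", 30, Gender.MALE),    // IF -> IF
            new User("Anna", 30, Gender.FEMALE),  // IF -> ELSE
            new User("Bob", 10, Gender.MALE),     // ELSE -> IF
            new User("Beth", 10, Gender.FEMALE)); // ELSE -> ELSE
    }

    public static void main(String[] args) {
        for (User u : onePerPath()) {
            System.out.println(describe(u));
        }
    }
}
```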
<h2>CC=5</h2>
<p class="akapit">
Our logic is expanding. Now we also need to add information about products... but only for a User who is an Adult Male.
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">String</span> complexMethod(<span style="color:#a71d5d;font-style:italic">User</span> u){
<span style="color:#a71d5d;font-style:italic">String</span> value<span style="color:#794938">=</span><span style="color:#0b6125">""</span>;
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>age <span style="color:#794938">></span> <span style="color:#811f24;font-weight:700">18</span>){
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
<span style="color:#794938">for</span> (<span style="color:#a71d5d;font-style:italic">Product</span> p<span style="color:#794938">:</span> u<span style="color:#794938">.</span>products) {
value<span style="color:#794938">+=</span><span style="color:#0b6125">"has product "</span><span style="color:#794938">+</span>p<span style="color:#794938">.</span>name<span style="color:#794938">+</span><span style="color:#0b6125">","</span>;
}
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}<span style="color:#794938">else</span>{
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}
<span style="color:#794938">return</span> value;
}
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbTJvxwVXABDBJzos-ZbA3f_3Y3v24TvBltBQz3MiZhmEoks0Gf7kScUMZClpnakRwA0tuO-OxJCDAdkADyVCi1QSrF-yaasW7hTaWrMN2WxWz8-5CqqW-4uaR7rLToyApg7ctSkQpqR4d/s1600/cc5sonar.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbTJvxwVXABDBJzos-ZbA3f_3Y3v24TvBltBQz3MiZhmEoks0Gf7kScUMZClpnakRwA0tuO-OxJCDAdkADyVCi1QSrF-yaasW7hTaWrMN2WxWz8-5CqqW-4uaR7rLToyApg7ctSkQpqR4d/s400/cc5sonar.jpg" /></a></div>
<p class="akapit">
Technically, Sonar just counts the number of fors and ifs to calculate CC, but we can understand the result of <b>CC=5</b> this way:
<ul>
<li>For an empty collection nothing changes: <b>CC+0</b> </li>
<li>For a non-empty collection we have another possible path: <b>CC+1</b> </li>
</ul>
</p>
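<p>
The "count the ifs and fors, then add one" arithmetic can be sketched with a deliberately naive counter. This is not how Sonar is implemented (a real analyzer works on the syntax tree, and this regex would also count keywords inside strings or comments); the class and method names are made up for illustration.
</p>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NaiveComplexity {
    // Whole-word matches of the branching keywords used in the examples.
    private static final Pattern DECISION = Pattern.compile("\\b(if|for|while)\\b");

    // Skeleton of the CC=5 method: three ifs and one for.
    static final String SAMPLE =
          "if(u.age > 18){"
        + " if(u.gender==Gender.MALE){"
        + "  for (Product p: u.products) { }"
        + " }else{ }"
        + "}else{"
        + " if(u.gender==Gender.MALE){ }else{ }"
        + "}";

    // Estimated CC = number of decision points + 1.
    static int estimate(String methodSource) {
        Matcher m = DECISION.matcher(methodSource);
        int decisions = 0;
        while (m.find()) {
            decisions++;
        }
        return decisions + 1;
    }

    public static void main(String[] args) {
        System.out.println(estimate(SAMPLE)); // 3 ifs + 1 for + 1 = 5
    }
}
```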
<h2>CC=6</h2>
<p class="akapit">
Let's make things more interesting by adding a filtering condition to the products.
</p>
<pre style="background:#f9f9f9;color:#080808"> <span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">String</span> complexMethod(<span style="color:#a71d5d;font-style:italic">User</span> u){
<span style="color:#a71d5d;font-style:italic">String</span> value<span style="color:#794938">=</span><span style="color:#0b6125">""</span>;
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>age <span style="color:#794938">></span> <span style="color:#811f24;font-weight:700">18</span>){
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
<span style="color:#794938">for</span> (<span style="color:#a71d5d;font-style:italic">Product</span> p<span style="color:#794938">:</span> u<span style="color:#794938">.</span>products) {
<span style="color:#794938">if</span>(p<span style="color:#794938">.</span>category<span style="color:#794938">!=</span> <span style="color:#a71d5d;font-style:italic">Category</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>ADULT</span>) {
<span style="color:#5a525f;font-style:italic">/*HAAADUUUKEN ~~~~@ */</span> value <span style="color:#794938">+=</span> <span style="color:#0b6125">"has product "</span> <span style="color:#794938">+</span> p<span style="color:#794938">.</span>name <span style="color:#794938">+</span> <span style="color:#0b6125">","</span>;
}
}
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}<span style="color:#794938">else</span>{
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}<span style="color:#794938">else</span>{
value<span style="color:#794938">=</span><span style="color:#0b6125">"Child Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}
<span style="color:#794938">return</span> value;
}
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxtfzgN71P4qszzl9XelokaXx5ikEb1P_lI9N6nlXIbruoKpWXCTXD5szE1O1RUB3jKlPL8NFES66ZWz3cfDSpBOd72c28Yq8M8zPFTlzfDvwb_4Twoh2Dlg9ur-P8exOlEt-OUbGVZmnn/s1600/haduuken.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhxtfzgN71P4qszzl9XelokaXx5ikEb1P_lI9N6nlXIbruoKpWXCTXD5szE1O1RUB3jKlPL8NFES66ZWz3cfDSpBOd72c28Yq8M8zPFTlzfDvwb_4Twoh2Dlg9ur-P8exOlEt-OUbGVZmnn/s400/haduuken.jpg" /></a></div>
<p class="akapit">
We have one more if in our code. Cyclomatic complexity is now <b>CC=6</b>, which may seem small but it isn't. We already have the <a href="http://c2.com/cgi/wiki?ArrowAntiPattern">Arrow Code Anti-pattern</a>, and the context awareness of this piece of (sh...) code makes it almost impossible to reuse anywhere else.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzEtr5Yezrm0DnR0OHmYlLNo8AKH6rUq0f00n_JwoockQ9yIf-PB2WWFFs_CPy1skh-g4KDmXK-tZ4ildf75jThmvbt3J_Et91nFVy2sgssfpFxdAnINE7B6eZ-SOgTHqcuKlF-7FplNaz/s1600/cc6sonar.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzEtr5Yezrm0DnR0OHmYlLNo8AKH6rUq0f00n_JwoockQ9yIf-PB2WWFFs_CPy1skh-g4KDmXK-tZ4ildf75jThmvbt3J_Et91nFVy2sgssfpFxdAnINE7B6eZ-SOgTHqcuKlF-7FplNaz/s400/cc6sonar.jpg" /></a></div>
<p>
Soon we will see how this way of coding raises complexity in a way that is closer to exponential than linear. But first let's look at something called <b>essential complexity</b>.
</p>
<h1>Essential Complexity </h1>
<p class="akapit">
What if we don't want to initialize a mutable variable, but would rather return from within the nested ifs?
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">public</span> <span style="color:#a71d5d;font-style:italic">String</span> complexMethod(<span style="color:#a71d5d;font-style:italic">User</span> u){
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>age <span style="color:#794938">></span> <span style="color:#811f24;font-weight:700">18</span>){
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
<span style="color:#a71d5d;font-style:italic">String</span> value<span style="color:#794938">=</span><span style="color:#0b6125">"Adult Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
<span style="color:#794938">for</span> (<span style="color:#a71d5d;font-style:italic">Product</span> p<span style="color:#794938">:</span> u<span style="color:#794938">.</span>products) {
<span style="color:#794938">if</span>(p<span style="color:#794938">.</span>category<span style="color:#794938">!=</span> <span style="color:#a71d5d;font-style:italic">Category</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>ADULT</span>) {
<span style="color:#5a525f;font-style:italic">/*HAAADUUUKEN ~~~~@ */</span> value <span style="color:#794938">+=</span> <span style="color:#0b6125">"has product "</span> <span style="color:#794938">+</span> p<span style="color:#794938">.</span>name <span style="color:#794938">+</span> <span style="color:#0b6125">","</span>;
}
}
<span style="color:#794938">return</span> value;
}<span style="color:#794938">else</span>{
<span style="color:#794938">return</span> <span style="color:#0b6125">"Adult Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}<span style="color:#794938">else</span>{
<span style="color:#794938">if</span>(u<span style="color:#794938">.</span>gender<span style="color:#794938">==</span><span style="color:#a71d5d;font-style:italic">Gender</span><span style="color:#811f24;font-weight:700"><span style="color:#794938">.</span>MALE</span>){
<span style="color:#794938">return</span> <span style="color:#0b6125">"Child Male: "</span> <span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}<span style="color:#794938">else</span>{
<span style="color:#794938">return</span> <span style="color:#0b6125">"Child Female: "</span><span style="color:#794938">+</span>u<span style="color:#794938">.</span>name;
}
}
}
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwW9yqkyMNLafA1WwP08QJeRUPsTh5oGCVdrSzhoCep6V1jnWb_sEOM2Q9pq0tX87Xewry2tLHRPD1vpsUS76oCsQNqrUzwJm3IFKRmfJqyibVr-ngRJe1qQXHUl4XdovSG2SYxJ1TZusE/s1600/sonarCC10.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwW9yqkyMNLafA1WwP08QJeRUPsTh5oGCVdrSzhoCep6V1jnWb_sEOM2Q9pq0tX87Xewry2tLHRPD1vpsUS76oCsQNqrUzwJm3IFKRmfJqyibVr-ngRJe1qQXHUl4XdovSG2SYxJ1TZusE/s400/sonarCC10.jpg" /></a></div>
<p class="akapit">
Now Sonar returns <b>CC=10</b>, because technically the complexity reported by Sonar is not just Cyclomatic Complexity but CC + Essential Complexity (and maybe + something else). Here we get a better measurement with the metrics plugin.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAedboczz_-rcDcQutMj7L4KKIfZOhUZAAa4P5k2UouBJZ0juDqE7rfxAXcA8bQ4hClqduwLyVTpNnGcyKxc6diQceO4yrtopd3rgVoIVpJDNh26RL29voWGXDYRSV2gGRdX5SFxNcyoty/s1600/cc10intelij.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiAedboczz_-rcDcQutMj7L4KKIfZOhUZAAa4P5k2UouBJZ0juDqE7rfxAXcA8bQ4hClqduwLyVTpNnGcyKxc6diQceO4yrtopd3rgVoIVpJDNh26RL29voWGXDYRSV2gGRdX5SFxNcyoty/s640/cc10intelij.jpg" /></a></div>
<p class="akapit">
So <b>Complexity=CC+EC=6+4=10</b>. <a href="https://en.wikipedia.org/wiki/Essential_complexity_(numerical_measure_of_%22structuredness%22)">Essential Complexity</a> - in my understanding it measures the number of places where your logic execution can end. Because we have multiple returns, the code is more difficult to analyze. Whether this statement is correct is a matter of discussion, but generally, since I started learning Functional Programming and thinking in expressions, I believe I haven't used a construction like a return in the middle of the code. (BTW, I'm now going to change the syntax color so we can compare which is better.)
</p>
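<p>
As a rough illustration of "thinking in expressions", the same logic can be written with a single exit point. This is a sketch under assumptions: the record-like classes, the <b>TOYS</b> category and the helper names are invented here. The ternaries are still branches, so this does not by itself lower the cyclomatic number; it only removes the returns from the middle of the method.
</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

enum Gender { MALE, FEMALE }
enum Category { ADULT, TOYS } // TOYS is a made-up second category

class Product {
    final String name;
    final Category category;

    Product(String name, Category category) {
        this.name = name;
        this.category = category;
    }
}

class User {
    final String name;
    final int age;
    final Gender gender;
    final List<Product> products;

    User(String name, int age, Gender gender, List<Product> products) {
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.products = products;
    }
}

class ExpressionStyle {
    // The whole method is one expression with a single return.
    static String describe(User u) {
        boolean adult = u.age > 18;
        boolean male = u.gender == Gender.MALE;
        return adult
            ? (male ? "Adult Male: " + u.name + visibleProducts(u) : "Adult Female: " + u.name)
            : (male ? "Child Male: " + u.name : "Child Female: " + u.name);
    }

    // Same filter as the loop version: skip products in the ADULT category.
    static String visibleProducts(User u) {
        return u.products.stream()
            .filter(p -> p.category != Category.ADULT)
            .map(p -> "has product " + p.name + ",")
            .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        User tom = new User("Tom", 30, Gender.MALE,
            Arrays.asList(new Product("beer", Category.ADULT), new Product("ball", Category.TOYS)));
        System.out.println(describe(tom));
    }
}
```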
<p>
OK, we see the problem, now let's find a cure.
</p>
<h1>Refactoring OOP way</h1>
<p class="akapit">
The easiest thing to start with is to move the logic responsible for displaying products to a dedicated component. It's not easy to think about proper domain abstractions when we don't actually have any domain problem, but let's try with something generic.
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">interface</span> <span style="text-decoration:underline">ProductPolicy</span>{
<span style="color:#43a8ed;font-weight:700">boolean</span> <span style="color:#ff9358;font-weight:700">isCensored</span>(<span style="color:#43a8ed;font-weight:700">Category</span> <span style="font-style:italic">category</span>);
}
<span style="color:#43a8ed;font-weight:700">interface</span> <span style="text-decoration:underline">PolicyFactory</span>{
<span style="color:#43a8ed;font-weight:700">ProductPolicy</span> <span style="color:#ff9358;font-weight:700">create</span>(<span style="color:#43a8ed;font-weight:700">Collection<<span style="color:#43a8ed;font-weight:700">Category</span>></span> <span style="font-style:italic">forbiddenCategories</span>);
}
<span style="color:#43a8ed;font-weight:700">interface</span> <span style="text-decoration:underline">ProductDisplayer</span>{
<span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#ff9358;font-weight:700">display</span>(<span style="color:#43a8ed;font-weight:700">Collection<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="font-style:italic">products</span>);
}
</pre>
<p class="akapit">
So there is a policy which may be configured through a factory. And we have an interface for our displayer, whose implementation may look like this:
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">class</span> <span style="text-decoration:underline">CensoredDisplayer</span> <span style="color:#43a8ed;font-weight:700">implements</span> <span style="font-style:italic">ProductDisplayer</span>{
<span style="color:#43a8ed;font-weight:700">private</span> <span style="color:#43a8ed;font-weight:700">ProductPolicy</span> productPolicy;
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#ff9358;font-weight:700">CensoredDisplayer</span>(<span style="color:#43a8ed;font-weight:700">ProductPolicy</span> <span style="font-style:italic">productPolicy</span>) {
<span style="color:#318495">this</span><span style="color:#43a8ed;font-weight:700">.</span>productPolicy <span style="color:#43a8ed;font-weight:700">=</span> productPolicy;
}
<span style="color:#43a8ed;font-weight:700">@Override</span>
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#ff9358;font-weight:700">display</span>(<span style="color:#43a8ed;font-weight:700">Collection<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="font-style:italic">products</span>) {
<span style="color:#43a8ed;font-weight:700">String</span> result<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">""</span>;
<span style="color:#43a8ed;font-weight:700">for</span> (<span style="color:#43a8ed;font-weight:700">Product</span> p<span style="color:#43a8ed;font-weight:700">:</span> products) {
result<span style="color:#43a8ed;font-weight:700">+=</span>addToDisplay(p);
}
<span style="color:#43a8ed;font-weight:700">return</span> result;
}
<span style="color:#43a8ed;font-weight:700">private</span> <span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#ff9358;font-weight:700">addToDisplay</span>(<span style="color:#43a8ed;font-weight:700">Product</span> <span style="font-style:italic">p</span>){
<span style="color:#43a8ed;font-weight:700">return</span> productPolicy<span style="color:#43a8ed;font-weight:700">.</span>isCensored(p<span style="color:#43a8ed;font-weight:700">.</span>category)<span style="color:#43a8ed;font-weight:700">?</span> <span style="color:#049b0a">""</span> <span style="color:#43a8ed;font-weight:700">:</span> <span style="color:#049b0a">" has product "</span><span style="color:#43a8ed;font-weight:700">+</span>p<span style="color:#43a8ed;font-weight:700">.</span>name<span style="color:#43a8ed;font-weight:700">+</span><span style="color:#049b0a">","</span>;
}
}
</pre>
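<p>
The post does not show implementations of <b>ProductPolicy</b> or <b>PolicyFactory</b>, so here is one possible way to wire them together. The lambda-based factory, the <b>PolicyWiring</b> class, the <b>TOYS</b> category and the shape of <b>Product</b> are assumptions; the interfaces and the displayer repeat the code from the post so that the sketch compiles on its own.
</p>

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

enum Category { ADULT, TOYS } // TOYS is a made-up second category

class Product {
    final String name;
    final Category category;

    Product(String name, Category category) {
        this.name = name;
        this.category = category;
    }
}

interface ProductPolicy { boolean isCensored(Category category); }
interface PolicyFactory { ProductPolicy create(Collection<Category> forbiddenCategories); }
interface ProductDisplayer { String display(Collection<Product> products); }

class CensoredDisplayer implements ProductDisplayer {
    private final ProductPolicy productPolicy;

    CensoredDisplayer(ProductPolicy productPolicy) {
        this.productPolicy = productPolicy;
    }

    @Override
    public String display(Collection<Product> products) {
        StringBuilder result = new StringBuilder();
        for (Product p : products) {
            result.append(addToDisplay(p));
        }
        return result.toString();
    }

    private String addToDisplay(Product p) {
        return productPolicy.isCensored(p.category) ? "" : " has product " + p.name + ",";
    }
}

class PolicyWiring {
    // Hypothetical factory: a category is censored when it is in the forbidden set.
    // The set is copied so later changes to the caller's collection have no effect.
    static final PolicyFactory FACTORY = forbidden -> {
        Set<Category> copy = new HashSet<>(forbidden);
        return copy::contains;
    };

    static ProductDisplayer adultContentHidden() {
        return new CensoredDisplayer(FACTORY.create(Arrays.asList(Category.ADULT)));
    }
}
```

<p>
Because both interfaces have a single method, a lambda or a method reference is enough here; a named class would work just as well if the policy needs more state.
</p>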
<p class="akapit">
Now let's take a look at our laboratory code.
</p>
<pre style="background:#2a211c;color:#bdae9d"> <span style="color:#43a8ed;font-weight:700">private</span> <span style="color:#43a8ed;font-weight:700">ProductDisplayer</span> productDisplayer;
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#43a8ed;font-weight:700">String</span> complexMethod(<span style="color:#43a8ed;font-weight:700">User</span> u){
<span style="color:#43a8ed;font-weight:700">String</span> value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">""</span>;
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>age <span style="color:#43a8ed;font-weight:700">></span> <span style="color:#44aa43">18</span>){
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender<span style="color:#43a8ed;font-weight:700">==</span><span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#c5656b;font-weight:700"><span style="color:#43a8ed;font-weight:700">.</span>MALE</span>){
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Adult Male: "</span> <span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
value<span style="color:#43a8ed;font-weight:700">+=</span> productDisplayer<span style="color:#43a8ed;font-weight:700">.</span>display(u<span style="color:#43a8ed;font-weight:700">.</span>products);
}<span style="color:#43a8ed;font-weight:700">else</span>{
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Adult Female: "</span><span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
}
}<span style="color:#43a8ed;font-weight:700">else</span>{
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender<span style="color:#43a8ed;font-weight:700">==</span><span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#c5656b;font-weight:700"><span style="color:#43a8ed;font-weight:700">.</span>MALE</span>){
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Child Male: "</span> <span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
}<span style="color:#43a8ed;font-weight:700">else</span>{
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Child Female: "</span><span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
}
}
<span style="color:#43a8ed;font-weight:700">return</span> value;
}
</pre>
<p>
The complexity of this code is <b>CC=4</b> and the complexity of the Displayer is <b>CC=1.7</b>, so technically the whole system is already a little less complex. (And "Beans" is the name of the class where I put all the interfaces.)
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLgSPT3csbvTCYqIl2kbliJrjYZyalylB43Y7-QnC8NlExzR5yIuCZ6mbq8wIsGZVFNJQwJtGHfuN73csQ6kf843-wz27g8WpXQ34EiVDzdb3T060e0o17rtZ0Blf2eyx_fPOrTFVHn9QU/s1600/refactoringOOP1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiLgSPT3csbvTCYqIl2kbliJrjYZyalylB43Y7-QnC8NlExzR5yIuCZ6mbq8wIsGZVFNJQwJtGHfuN73csQ6kf843-wz27g8WpXQ34EiVDzdb3T060e0o17rtZ0Blf2eyx_fPOrTFVHn9QU/s640/refactoringOOP1.jpg" /></a></div>
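<p>
A side effect of this extraction is that the four remaining paths can be tested without any real product logic, by passing a stub displayer. The sketch below assumes constructor injection and a <b>UserDescriber</b> class name, neither of which is shown in the post.
</p>

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

enum Gender { MALE, FEMALE }

class Product {
    final String name;

    Product(String name) {
        this.name = name;
    }
}

class User {
    final String name;
    final int age;
    final Gender gender;
    final List<Product> products;

    User(String name, int age, Gender gender, List<Product> products) {
        this.name = name;
        this.age = age;
        this.gender = gender;
        this.products = products;
    }
}

interface ProductDisplayer { String display(Collection<Product> products); }

class UserDescriber {
    private final ProductDisplayer productDisplayer;

    // The displayer is injected, so a test can supply its own stub.
    UserDescriber(ProductDisplayer productDisplayer) {
        this.productDisplayer = productDisplayer;
    }

    String complexMethod(User u) {
        String value;
        if (u.age > 18) {
            if (u.gender == Gender.MALE) {
                value = "Adult Male: " + u.name;
                value += productDisplayer.display(u.products);
            } else {
                value = "Adult Female: " + u.name;
            }
        } else {
            if (u.gender == Gender.MALE) {
                value = "Child Male: " + u.name;
            } else {
                value = "Child Female: " + u.name;
            }
        }
        return value;
    }

    public static void main(String[] args) {
        // Stub displayer: only reports how many products it was given.
        UserDescriber describer = new UserDescriber(ps -> "[" + ps.size() + " products]");
        List<Product> two = Arrays.asList(new Product("a"), new Product("b"));
        System.out.println(describer.complexMethod(new User("Tom", 30, Gender.MALE, two)));
    }
}
```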
<h2>OOP data types</h2>
<p>
To move further, we can change the nature of our data types from records into richer entities and use inheritance polymorphism to dispatch execution between specific pieces of code.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpmQeTwb_4eqkxiMU9xskIf6gL6pLtuwalSp_qi1wo83kqMmuB0douvspHldhb2e3RXpTZIxYtCaNvJCKoNu70O2b2VuXum4nXPfQaCvsI5ltgz1_gGXO81cGY2cXwe2BGCRLtSawz_iLi/s1600/classesOOP.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhpmQeTwb_4eqkxiMU9xskIf6gL6pLtuwalSp_qi1wo83kqMmuB0douvspHldhb2e3RXpTZIxYtCaNvJCKoNu70O2b2VuXum4nXPfQaCvsI5ltgz1_gGXO81cGY2cXwe2BGCRLtSawz_iLi/s400/classesOOP.png" /></a></div>
<p>
Check the code.
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">abstract</span> <span style="color:#43a8ed;font-weight:700">class</span> <span style="text-decoration:underline">User</span>{
<span style="color:#43a8ed;font-weight:700">protected</span> <span style="color:#43a8ed;font-weight:700">final</span> <span style="color:#43a8ed;font-weight:700">String</span> name;
<span style="color:#43a8ed;font-weight:700">protected</span> <span style="color:#43a8ed;font-weight:700">Integer</span> age;
<span style="color:#43a8ed;font-weight:700">protected</span> <span style="color:#43a8ed;font-weight:700">final</span> <span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> products;
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#ff9358;font-weight:700">User</span>(<span style="color:#43a8ed;font-weight:700">String</span> <span style="font-style:italic">name</span>, <span style="color:#43a8ed;font-weight:700">Integer</span> <span style="font-style:italic">age</span>, <span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="font-style:italic">products</span>) {
<span style="color:#318495">this</span><span style="color:#43a8ed;font-weight:700">.</span>name <span style="color:#43a8ed;font-weight:700">=</span> name;
<span style="color:#318495">this</span><span style="color:#43a8ed;font-weight:700">.</span>age <span style="color:#43a8ed;font-weight:700">=</span> age;
<span style="color:#318495">this</span><span style="color:#43a8ed;font-weight:700">.</span>products <span style="color:#43a8ed;font-weight:700">=</span> products;
}
<span style="color:#43a8ed;font-weight:700">abstract</span> <span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#ff9358;font-weight:700">introduceYourself</span>();
<span style="color:#43a8ed;font-weight:700">abstract</span> <span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="color:#ff9358;font-weight:700">showProducts</span>();
}
<span style="color:#43a8ed;font-weight:700">class</span> <span style="text-decoration:underline">MaleUser</span> <span style="color:#43a8ed;font-weight:700">extends</span> <span style="font-style:italic">User</span>{
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#ff9358;font-weight:700">MaleUser</span>(<span style="color:#43a8ed;font-weight:700">String</span> <span style="font-style:italic">name</span>, <span style="color:#43a8ed;font-weight:700">Integer</span> <span style="font-style:italic">age</span>, <span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="font-style:italic">products</span>) {
<span style="color:#318495">super</span>(name, age, products);
}
<span style="color:#43a8ed;font-weight:700">@Override</span>
<span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#ff9358;font-weight:700">introduceYourself</span>() {
<span style="color:#43a8ed;font-weight:700">return</span> (age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span><span style="color:#43a8ed;font-weight:700">?</span><span style="color:#049b0a">"ADULT MALE"</span><span style="color:#43a8ed;font-weight:700">:</span><span style="color:#049b0a">"CHILD MALE"</span>) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" : "</span><span style="color:#43a8ed;font-weight:700">+</span>name;
}
<span style="color:#43a8ed;font-weight:700">@Override</span>
<span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="color:#ff9358;font-weight:700">showProducts</span>() {
<span style="color:#43a8ed;font-weight:700">return</span> age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span><span style="color:#43a8ed;font-weight:700">?</span> products<span style="color:#43a8ed;font-weight:700">:</span> <span style="color:#43a8ed;font-weight:700">new</span> <span style="color:#43a8ed;font-weight:700">LinkedList<></span>();
}
}
<span style="color:#43a8ed;font-weight:700">class</span> <span style="text-decoration:underline">FemaleUser</span> <span style="color:#43a8ed;font-weight:700">extends</span> <span style="font-style:italic">User</span>{
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#ff9358;font-weight:700">FemaleUser</span>(<span style="color:#43a8ed;font-weight:700">String</span> <span style="font-style:italic">name</span>, <span style="color:#43a8ed;font-weight:700">Integer</span> <span style="font-style:italic">age</span>, <span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="font-style:italic">products</span>) {
<span style="color:#318495">super</span>(name, age, products);
}
<span style="color:#43a8ed;font-weight:700">@Override</span>
<span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#ff9358;font-weight:700">introduceYourself</span>() {
<span style="color:#43a8ed;font-weight:700">return</span> (age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span><span style="color:#43a8ed;font-weight:700">?</span><span style="color:#049b0a">"ADULT FEMALE"</span><span style="color:#43a8ed;font-weight:700">:</span><span style="color:#049b0a">"CHILD FEMALE"</span>)<span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" : "</span><span style="color:#43a8ed;font-weight:700">+</span>name;
}
<span style="color:#43a8ed;font-weight:700">@Override</span>
<span style="color:#43a8ed;font-weight:700">List<<span style="color:#43a8ed;font-weight:700">Product</span>></span> <span style="color:#ff9358;font-weight:700">showProducts</span>() {
<span style="color:#43a8ed;font-weight:700">return</span> <span style="color:#43a8ed;font-weight:700">new</span> <span style="color:#43a8ed;font-weight:700">LinkedList<></span>();
}
}
</pre>
<p>
What is important to notice here is how context information was moved to the children of the (now) abstract class <b>User</b>. Context information is now encapsulated inside classes, and this construction reduces Cyclomatic Complexity in the place where the objects are used. Look at this:
</p>
<pre style="background:#2a211c;color:#bdae9d"> <span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#43a8ed;font-weight:700">String</span> complexMethod(<span style="color:#43a8ed;font-weight:700">User</span> u){
<span style="color:#43a8ed;font-weight:700">String</span> result<span style="color:#43a8ed;font-weight:700">=</span>u<span style="color:#43a8ed;font-weight:700">.</span>introduceYourself();
result<span style="color:#43a8ed;font-weight:700">+=</span>productDisplayer<span style="color:#43a8ed;font-weight:700">.</span>display(u<span style="color:#43a8ed;font-weight:700">.</span>showProducts());
<span style="color:#43a8ed;font-weight:700">return</span> result;
}
</pre>
<p>
So once again: we take control over the context awareness of particular logic by moving this logic inside classes. Execution is controlled by polymorphic method dispatch.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-HVY8HdZtiqnK0VFDTO1IiYThiD9HFik0NyJn8dBldw2W05IjGaS1w0zbrV4QOCn2tI_LhnM2DHpoNNvcQu4pFaw2oaJBhIdLJS4OolncQJCTlMXkShyphenhyphenNo0kWQHRBFDcu6fyDgJUJlKUa/s1600/cc1sonar.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-HVY8HdZtiqnK0VFDTO1IiYThiD9HFik0NyJn8dBldw2W05IjGaS1w0zbrV4QOCn2tI_LhnM2DHpoNNvcQu4pFaw2oaJBhIdLJS4OolncQJCTlMXkShyphenhyphenNo0kWQHRBFDcu6fyDgJUJlKUa/s400/cc1sonar.jpg" /></a></div>
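<p>
To see the dispatch at work, here is a compact, self-contained sketch of the classes above together with a call site. The <b>Dispatch</b> class and the uncensored <b>PLAIN</b> displayer are assumptions added only to make the example runnable; the user classes follow the code from the post.
</p>

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedList;
import java.util.List;

class Product {
    final String name;

    Product(String name) {
        this.name = name;
    }
}

interface ProductDisplayer { String display(Collection<Product> products); }

abstract class User {
    protected final String name;
    protected final Integer age;
    protected final List<Product> products;

    User(String name, Integer age, List<Product> products) {
        this.name = name;
        this.age = age;
        this.products = products;
    }

    abstract String introduceYourself();
    abstract List<Product> showProducts();
}

class MaleUser extends User {
    MaleUser(String name, Integer age, List<Product> products) { super(name, age, products); }

    @Override
    String introduceYourself() { return (age > 18 ? "ADULT MALE" : "CHILD MALE") + " : " + name; }

    @Override
    List<Product> showProducts() { return age > 18 ? products : new LinkedList<>(); }
}

class FemaleUser extends User {
    FemaleUser(String name, Integer age, List<Product> products) { super(name, age, products); }

    @Override
    String introduceYourself() { return (age > 18 ? "ADULT FEMALE" : "CHILD FEMALE") + " : " + name; }

    @Override
    List<Product> showProducts() { return new LinkedList<>(); }
}

class Dispatch {
    // Hypothetical displayer without censoring, just to make the sketch runnable.
    static final ProductDisplayer PLAIN = ps -> {
        StringBuilder sb = new StringBuilder();
        for (Product p : ps) {
            sb.append(" has product ").append(p.name).append(",");
        }
        return sb.toString();
    };

    // No if at the call site: the subclass decides what to say and what to show.
    static String complexMethod(User u) {
        return u.introduceYourself() + PLAIN.display(u.showProducts());
    }

    public static void main(String[] args) {
        List<Product> ps = Arrays.asList(new Product("ball"));
        System.out.println(complexMethod(new MaleUser("Tom", 30, ps)));
        System.out.println(complexMethod(new FemaleUser("Ann", 30, ps)));
    }
}
```

<p>
The age check has not disappeared: it now lives in the ternaries inside each subclass, which is where the complexity moved.
</p>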
<h1>Refactoring FP Way </h1>
<p class="akapit">
We are going to start in a similar way as we did in the OOP example. So at the beginning let's move the logic responsible for filtering and displaying products into dedicated functions.
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">Set<<span style="color:#43a8ed;font-weight:700">Category</span>></span>,<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">Category</span>,<span style="color:#43a8ed;font-weight:700">Boolean</span>></span>></span> policy<span style="color:#43a8ed;font-weight:700">=</span>
forbiddenCategories <span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span> category <span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span> <span style="color:#43a8ed;font-weight:700">!</span>forbiddenCategories<span style="color:#43a8ed;font-weight:700">.</span>contains(category);
<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">Category</span>,<span style="color:#43a8ed;font-weight:700">Boolean</span>></span> adultPolicy<span style="color:#43a8ed;font-weight:700">=</span>policy<span style="color:#43a8ed;font-weight:700">.</span>apply(<span style="color:#43a8ed;font-weight:700">new</span> <span style="color:#43a8ed;font-weight:700">HashSet<></span>(<span style="color:#43a8ed;font-weight:700">Arrays</span><span style="color:#43a8ed;font-weight:700">.</span>asList(<span style="color:#43a8ed;font-weight:700">Category</span><span style="color:#c5656b;font-weight:700"><span style="color:#43a8ed;font-weight:700">.</span>ADULT</span>)));
<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">Category</span>,<span style="color:#43a8ed;font-weight:700">Boolean</span>></span>,<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">Collection<<span style="color:#43a8ed;font-weight:700">Product</span>></span>,<span style="color:#43a8ed;font-weight:700">String</span>></span>></span> displayProducts<span style="color:#43a8ed;font-weight:700">=</span> policy<span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span> ps <span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span>
ps<span style="color:#43a8ed;font-weight:700">.</span>stream()
.filter(p<span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span>policy<span style="color:#43a8ed;font-weight:700">.</span>apply(p<span style="color:#43a8ed;font-weight:700">.</span>category))
.map(p<span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span><span style="color:#049b0a">" has product "</span><span style="color:#43a8ed;font-weight:700">+</span>p<span style="color:#43a8ed;font-weight:700">.</span>name)
.collect(<span style="color:#43a8ed;font-weight:700">Collectors</span><span style="color:#43a8ed;font-weight:700">.</span>joining(<span style="color:#049b0a">","</span>));
</pre>
<p>So we have a <i>Policy</i>, a <i>Parametrized Policy</i> and a <i>Product Displayer</i>. The complexity of our lab method is now reduced to <b>CC=4</b>:</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#43a8ed;font-weight:700">String</span> complexMethod(<span style="color:#43a8ed;font-weight:700">User</span> u){
<span style="color:#43a8ed;font-weight:700">String</span> value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">""</span>;
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>age <span style="color:#43a8ed;font-weight:700">></span> <span style="color:#44aa43">18</span>){
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender<span style="color:#43a8ed;font-weight:700">==</span><span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#c5656b;font-weight:700"><span style="color:#43a8ed;font-weight:700">.</span>MALE</span>){
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Adult Male: "</span> <span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
value<span style="color:#43a8ed;font-weight:700">+=</span>displayProducts<span style="color:#43a8ed;font-weight:700">.</span>apply(adultPolicy)<span style="color:#43a8ed;font-weight:700">.</span>apply(u<span style="color:#43a8ed;font-weight:700">.</span>products);
}<span style="color:#43a8ed;font-weight:700">else</span>{
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Adult Female: "</span><span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
}
}<span style="color:#43a8ed;font-weight:700">else</span>{
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender<span style="color:#43a8ed;font-weight:700">==</span><span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#c5656b;font-weight:700"><span style="color:#43a8ed;font-weight:700">.</span>MALE</span>){
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Child Male: "</span> <span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
}<span style="color:#43a8ed;font-weight:700">else</span>{
value<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">"Child Female: "</span><span style="color:#43a8ed;font-weight:700">+</span>u<span style="color:#43a8ed;font-weight:700">.</span>name;
}
}
<span style="color:#43a8ed;font-weight:700">return</span> value;
}
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjor68ZP7e4r2LyNIHFicSEOKk9Wq2gcxJlt6JwffSu539fYz5yPWyaIUClePZhQPcltVZWfjAhoten_4r3aDMns7PfkdrsuVhAWW_yAVVVVZaRqhn-Hd1vR5xd7nUa8nRwg_xImb4v4WMh/s1600/fpcc4.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjor68ZP7e4r2LyNIHFicSEOKk9Wq2gcxJlt6JwffSu539fYz5yPWyaIUClePZhQPcltVZWfjAhoten_4r3aDMns7PfkdrsuVhAWW_yAVVVVZaRqhn-Hd1vR5xd7nUa8nRwg_xImb4v4WMh/s400/fpcc4.jpg" /></a></div>
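<p>
To see the curried functions above in action, here is a minimal self-contained sketch. The nested <i>Category</i> and <i>Product</i> types, the class name <i>CurriedPolicyDemo</i> and the sample products are assumptions made only for this illustration; the three functions follow the snippet shown earlier.
</p>

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

final class CurriedPolicyDemo {
    // Hypothetical stand-ins for the article's domain types.
    enum Category { FITNESS, COMPUTER, ADULT }

    static final class Product {
        final String name;
        final Category category;

        Product(String name, Category category) {
            this.name = name;
            this.category = category;
        }
    }

    // Curried policy: first the forbidden set, then the category to check.
    static final Function<Set<Category>, Function<Category, Boolean>> policy =
        forbidden -> category -> !forbidden.contains(category);

    // Partial application: fixing the forbidden set yields a reusable check.
    static final Function<Category, Boolean> adultPolicy =
        policy.apply(new HashSet<>(Arrays.asList(Category.ADULT)));

    // Curried displayer: first the policy, then the products to render.
    static final Function<Function<Category, Boolean>, Function<Collection<Product>, String>> displayProducts =
        allowed -> ps -> ps.stream()
            .filter(p -> allowed.apply(p.category))
            .map(p -> " has product " + p.name)
            .collect(Collectors.joining(","));

    public static void main(String[] args) {
        List<Product> products = Arrays.asList(
            new Product("dumbbell", Category.FITNESS),
            new Product("magazine", Category.ADULT),
            new Product("laptop", Category.COMPUTER));
        // The ADULT product is filtered out by the partially applied policy.
        System.out.println(displayProducts.apply(adultPolicy).apply(products));
    }
}
```

<p>
Because the policy is passed in as a parameter, the displayer itself knows nothing about adults or any other context; a different forbidden set gives a different behaviour without touching the displaying code.
</p>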
<p>
Now let's introduce some conditional logic into the next functions, mainly to check how good Sonar is at measuring the CC of functions.
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">User</span>,<span style="color:#43a8ed;font-weight:700">String</span>></span> ageLabel<span style="color:#43a8ed;font-weight:700">=</span> u <span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span> {
<span style="color:#43a8ed;font-weight:700">if</span> (u<span style="color:#43a8ed;font-weight:700">.</span>age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span>)
<span style="color:#43a8ed;font-weight:700">return</span> <span style="color:#049b0a">"ADULT"</span>;
<span style="color:#43a8ed;font-weight:700">else</span>
<span style="color:#43a8ed;font-weight:700">return</span> <span style="color:#049b0a">"CHILD"</span>;
};
<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">User</span>,<span style="color:#43a8ed;font-weight:700">String</span>></span> introduce<span style="color:#43a8ed;font-weight:700">=</span>u<span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span> {
<span style="color:#43a8ed;font-weight:700">String</span> result<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">""</span>;
<span style="color:#43a8ed;font-weight:700">switch</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender) {
<span style="color:#43a8ed;font-weight:700">case</span> <span style="color:#c5656b;font-weight:700">MALE</span><span style="color:#43a8ed;font-weight:700">:</span>
result<span style="color:#43a8ed;font-weight:700">=</span> ageLabel<span style="color:#43a8ed;font-weight:700">.</span>apply(u) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" MALE"</span> <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" : "</span> <span style="color:#43a8ed;font-weight:700">+</span> u<span style="color:#43a8ed;font-weight:700">.</span>name;
<span style="color:#43a8ed;font-weight:700">break</span>;
<span style="color:#43a8ed;font-weight:700">case</span> <span style="color:#c5656b;font-weight:700">FEMALE</span><span style="color:#43a8ed;font-weight:700">:</span> result<span style="color:#43a8ed;font-weight:700">=</span> ageLabel<span style="color:#43a8ed;font-weight:700">.</span>apply(u) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" FEMALE"</span> <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" : "</span> <span style="color:#43a8ed;font-weight:700">+</span> u<span style="color:#43a8ed;font-weight:700">.</span>name;
}
<span style="color:#43a8ed;font-weight:700">return</span> result;
};
<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">User</span>,<span style="color:#43a8ed;font-weight:700">Collection<<span style="color:#43a8ed;font-weight:700">Product</span>></span>></span> getProducts <span style="color:#43a8ed;font-weight:700">=</span> u<span style="color:#43a8ed;font-weight:700">-</span><span style="color:#43a8ed;font-weight:700">></span> {
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender<span style="color:#43a8ed;font-weight:700">==</span><span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#c5656b;font-weight:700"><span style="color:#43a8ed;font-weight:700">.</span>MALE</span> <span style="color:#43a8ed;font-weight:700">&&</span> u<span style="color:#43a8ed;font-weight:700">.</span>age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span> )
<span style="color:#43a8ed;font-weight:700">return</span> u<span style="color:#43a8ed;font-weight:700">.</span>products;
<span style="color:#43a8ed;font-weight:700">else</span>
<span style="color:#43a8ed;font-weight:700">return</span> <span style="color:#43a8ed;font-weight:700">new</span> <span style="color:#43a8ed;font-weight:700">ArrayList<></span>();
};
</pre>
<p>
And finally let's see how all this functional machinery can help us!
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#06f;font-style:italic">//composition!!</span>
<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">User</span>, <span style="color:#43a8ed;font-weight:700">String</span>></span> productDisplayer <span style="color:#43a8ed;font-weight:700">=</span> getProducts<span style="color:#43a8ed;font-weight:700">.</span>andThen(displayProducts<span style="color:#43a8ed;font-weight:700">.</span>apply(adultPolicy));
<span style="color:#43a8ed;font-weight:700">public</span> <span style="color:#43a8ed;font-weight:700">String</span> complexMethod(<span style="color:#43a8ed;font-weight:700">User</span> u){
<span style="color:#43a8ed;font-weight:700">return</span> introduce<span style="color:#43a8ed;font-weight:700">.</span>apply(u) <span style="color:#43a8ed;font-weight:700">+</span> productDisplayer<span style="color:#43a8ed;font-weight:700">.</span>apply(u);
}
</pre>
<p>
This looks wonderful - and now the good and the bad news.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5GbVvuGXJa8UsIEUypVnNPQlEhEbTbVTHzmKoFUDdGMVle50wAf6F0GWJxzYiS_ZGvd8V6xLjhbk3l1tnPNUdHWZcsiNHOe03xdwszwG15sedXrw_1_qLHpsVFOZV8vINyihUjQVjdGPe/s1600/functionalcc1.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5GbVvuGXJa8UsIEUypVnNPQlEhEbTbVTHzmKoFUDdGMVle50wAf6F0GWJxzYiS_ZGvd8V6xLjhbk3l1tnPNUdHWZcsiNHOe03xdwszwG15sedXrw_1_qLHpsVFOZV8vINyihUjQVjdGPe/s400/functionalcc1.jpg" /></a></div>
<p>
We have reduced the complexity of our lab function to <b>CC=1</b>, but unfortunately Sonar is unable to measure complexity inside lambda expressions. I tried Sonar 4.X and Sonar 5.X - both without success.
The only way I found to get proper CC measurements is to move the logic into a regular method and use a method reference.
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">static</span> <span style="color:#43a8ed;font-weight:700">String</span> introduceMethod(<span style="color:#43a8ed;font-weight:700">User</span> u){
<span style="color:#43a8ed;font-weight:700">String</span> result<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#049b0a">""</span>;
<span style="color:#43a8ed;font-weight:700">switch</span>(u<span style="color:#43a8ed;font-weight:700">.</span>gender) {
<span style="color:#43a8ed;font-weight:700">case</span> <span style="color:#c5656b;font-weight:700">MALE</span><span style="color:#43a8ed;font-weight:700">:</span>
result<span style="color:#43a8ed;font-weight:700">=</span> ageLabel<span style="color:#43a8ed;font-weight:700">.</span>apply(u) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" MALE"</span> <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" : "</span> <span style="color:#43a8ed;font-weight:700">+</span> u<span style="color:#43a8ed;font-weight:700">.</span>name;
<span style="color:#43a8ed;font-weight:700">break</span>;
<span style="color:#43a8ed;font-weight:700">case</span> <span style="color:#c5656b;font-weight:700">FEMALE</span><span style="color:#43a8ed;font-weight:700">:</span> result<span style="color:#43a8ed;font-weight:700">=</span> ageLabel<span style="color:#43a8ed;font-weight:700">.</span>apply(u) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" FEMALE"</span> <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" : "</span> <span style="color:#43a8ed;font-weight:700">+</span> u<span style="color:#43a8ed;font-weight:700">.</span>name;
}
<span style="color:#43a8ed;font-weight:700">return</span> result;
}
<span style="color:#43a8ed;font-weight:700">Function<<span style="color:#43a8ed;font-weight:700">User</span>,<span style="color:#43a8ed;font-weight:700">String</span>></span> introduce<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">BlogComplexityFP</span><span style="color:#43a8ed;font-weight:700">:</span><span style="color:#43a8ed;font-weight:700">:</span>introduceMethod;
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmIifeKUVZugjEvork8QQ8EiQPsfamjMHuFG3w_ch7JXqfMiOmBgAQC4akxmmUsKcRqY6XUd9wbaB9GflNobjAyNEK8Ixgt40ULv7HTdycYzxcXbonLo-WMXLXLy_NOTgrbb5bUZW6H-JF/s1600/functionalCC2.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhmIifeKUVZugjEvork8QQ8EiQPsfamjMHuFG3w_ch7JXqfMiOmBgAQC4akxmmUsKcRqY6XUd9wbaB9GflNobjAyNEK8Ixgt40ULv7HTdycYzxcXbonLo-WMXLXLy_NOTgrbb5bUZW6H-JF/s400/functionalCC2.jpg" /></a></div>
<p>We saw how to handle complexity in Java in both the OOP and the FP way - now let's quickly check how to measure complexity in the Scala world. </p>
<h1>Scala Measurements </h1>
<p class="akapit">
I believe that the main quality tool for Scala is <a href="http://www.scalastyle.org/">Scalastyle</a>. It has a plugin for Sonar, but it is not as rich as the one for Java. Generally, in <i>Scalastyle</i> we can set an acceptable Cyclomatic Complexity level, and if the code exceeds it a warning is raised.
</p>
<p>
So if I have this ugly piece of code
</p>
<pre style="background:#2a211c;color:#bdae9d"><span style="color:#43a8ed;font-weight:700">def</span> <span style="color:#ff9358;font-weight:700">complexMethod</span>(<span style="font-style:italic">u</span>: User): <span style="color:#43a8ed;font-weight:700">String</span> = {
<span style="color:#43a8ed;font-weight:700">if</span> (u.age > <span style="color:#44aa43">18</span>) {
<span style="color:#43a8ed;font-weight:700">if</span> (u.gender == MALE) {
<span style="color:#43a8ed;font-weight:700">var</span> value: <span style="color:#43a8ed;font-weight:700">String</span> = <span style="color:#049b0a">"Adult Male: "</span> + u.name
<span style="color:#43a8ed;font-weight:700">for</span> (p <- u.products) {
<span style="color:#43a8ed;font-weight:700">if</span> (p.category != Category.ADULT) {
value += <span style="color:#049b0a">" and has product "</span> + p.name + <span style="color:#049b0a">","</span>
}
}
value
}
<span style="color:#43a8ed;font-weight:700">else</span> {
<span style="color:#049b0a">"Adult Female: "</span> + u.name
}
}
<span style="color:#43a8ed;font-weight:700">else</span> {
<span style="color:#43a8ed;font-weight:700">if</span> (u.gender == MALE) {
<span style="color:#049b0a">"Child Male: "</span> + u.name
}
<span style="color:#43a8ed;font-weight:700">else</span> {
<span style="color:#049b0a">"Child Female: "</span> + u.name
}
}
}
</pre>
<p>
Then I will only receive a warning
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIG0tRWhP-5EtFXWS9ZqaC7xw-8jLl2wlTdubHiEjKMGDqhmJ5BNq8rKNHFN70CaxhlsNITmOPFvzIjomz1iMW8rLytzpLvYZxNddjYxzIWoFQ_qZva4lXNaGFgbu3ewVx99utFxuznG_B/s1600/ScalaProcedural.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgIG0tRWhP-5EtFXWS9ZqaC7xw-8jLl2wlTdubHiEjKMGDqhmJ5BNq8rKNHFN70CaxhlsNITmOPFvzIjomz1iMW8rLytzpLvYZxNddjYxzIWoFQ_qZva4lXNaGFgbu3ewVx99utFxuznG_B/s640/ScalaProcedural.jpg" /></a></div>
<p>
More info will be displayed in the sbt console
</p>
<pre>
ProceduralExample.scala:26:6: Cyclomatic complexity of 6 exceeds max of 1
</pre>
<p>
And finally Scala code with <b>CC=1</b>
</p>
<pre style="background:#2a211c;color:#bdae9d">object <span style="color:#43a8ed;font-weight:700">FPExample</span> {
object <span style="color:#43a8ed;font-weight:700">Gender</span> extends <span style="color:#43a8ed;font-weight:700">Enumeration</span>{
type <span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">Value</span>
val <span style="color:#c5656b;font-weight:700">MALE</span>,<span style="color:#c5656b;font-weight:700">FEMALE</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">Value</span>
}
object <span style="color:#43a8ed;font-weight:700">Category</span> extends <span style="color:#43a8ed;font-weight:700">Enumeration</span>{
type <span style="color:#43a8ed;font-weight:700">Category</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">Value</span>
val <span style="color:#c5656b;font-weight:700">FITNESS</span>,<span style="color:#c5656b;font-weight:700">COMPUTER</span>,<span style="color:#c5656b;font-weight:700">ADULT</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">Value</span>
}
import <span style="color:#43a8ed;font-weight:700">Gender</span><span style="color:#43a8ed;font-weight:700">.</span>_
import <span style="color:#43a8ed;font-weight:700">Category</span><span style="color:#43a8ed;font-weight:700">.</span>_
<span style="color:#43a8ed;font-weight:700">case</span> class <span style="color:#43a8ed;font-weight:700">Product</span>(<span style="color:#c5656b;font-weight:700">name</span>:<span style="color:#43a8ed;font-weight:700">String</span>,<span style="color:#c5656b;font-weight:700">category</span>: <span style="color:#43a8ed;font-weight:700">Category</span>)
<span style="color:#43a8ed;font-weight:700">case</span> class <span style="color:#43a8ed;font-weight:700">User</span>(<span style="color:#c5656b;font-weight:700">name</span>:<span style="color:#43a8ed;font-weight:700">String</span>,<span style="color:#c5656b;font-weight:700">age</span>:<span style="color:#43a8ed;font-weight:700">Int</span>,<span style="color:#c5656b;font-weight:700">gender</span>:<span style="color:#43a8ed;font-weight:700">Gender</span>,<span style="color:#c5656b;font-weight:700">products</span>:<span style="color:#43a8ed;font-weight:700">List</span>[<span style="color:#43a8ed;font-weight:700">Product</span>])
val policy : <span style="color:#43a8ed;font-weight:700">Set</span>[<span style="color:#43a8ed;font-weight:700">Category</span>] <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> <span style="color:#43a8ed;font-weight:700">Category</span> <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> <span style="color:#43a8ed;font-weight:700">Boolean</span> <span style="color:#43a8ed;font-weight:700">=</span>
forbiddenCategories <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> category <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> <span style="color:#43a8ed;font-weight:700">!</span>forbiddenCategories<span style="color:#43a8ed;font-weight:700">.</span>contains(category)
val adultPolicy <span style="color:#43a8ed;font-weight:700">=</span> policy(<span style="color:#43a8ed;font-weight:700">Set</span>(<span style="color:#c5656b;font-weight:700">ADULT</span>))
<span style="color:#43a8ed;font-weight:700">def</span> displayProducts(<span style="color:#c5656b;font-weight:700">policy</span>:<span style="color:#43a8ed;font-weight:700">Category</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span><span style="color:#43a8ed;font-weight:700">Boolean</span>)(<span style="color:#c5656b;font-weight:700">ps</span>:<span style="color:#43a8ed;font-weight:700">Seq</span>[<span style="color:#43a8ed;font-weight:700">Product</span>]):<span style="color:#43a8ed;font-weight:700">String</span><span style="color:#43a8ed;font-weight:700">=</span>
ps
.filter(p<span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span>policy(p<span style="color:#43a8ed;font-weight:700">.</span>category))
.map(<span style="color:#049b0a">" and has product "</span> <span style="color:#43a8ed;font-weight:700">+</span> _<span style="color:#43a8ed;font-weight:700">.</span>name)
.mkString(<span style="color:#049b0a">","</span>)
val <span style="color:#c5656b;font-weight:700">agePrefix</span>: <span style="color:#43a8ed;font-weight:700">User</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span><span style="color:#43a8ed;font-weight:700">String</span><span style="color:#43a8ed;font-weight:700">=</span> u <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span>
<span style="color:#43a8ed;font-weight:700">if</span>(u<span style="color:#43a8ed;font-weight:700">.</span>age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span>) <span style="color:#049b0a">"adult"</span> <span style="color:#43a8ed;font-weight:700">else</span> <span style="color:#049b0a">"child"</span>
val <span style="color:#c5656b;font-weight:700">introduce</span> : <span style="color:#43a8ed;font-weight:700">User</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span><span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#43a8ed;font-weight:700">=</span> _ match {
<span style="color:#43a8ed;font-weight:700">case</span> u @ <span style="color:#43a8ed;font-weight:700">User</span>(name,_,<span style="color:#c5656b;font-weight:700">MALE</span>,_) <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> agePrefix(u) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" Male :"</span> <span style="color:#43a8ed;font-weight:700">+</span> name
<span style="color:#43a8ed;font-weight:700">case</span> u @ <span style="color:#43a8ed;font-weight:700">User</span>(name,_,<span style="color:#c5656b;font-weight:700">FEMALE</span>,_) <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> agePrefix(u) <span style="color:#43a8ed;font-weight:700">+</span> <span style="color:#049b0a">" Female :"</span> <span style="color:#43a8ed;font-weight:700">+</span> name
}
val getProducts: <span style="color:#43a8ed;font-weight:700">User</span><span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span><span style="color:#43a8ed;font-weight:700">Seq</span>[<span style="color:#43a8ed;font-weight:700">Product</span>]<span style="color:#43a8ed;font-weight:700">=</span> _ match {
<span style="color:#43a8ed;font-weight:700">case</span> <span style="color:#43a8ed;font-weight:700">User</span>(_,age,<span style="color:#c5656b;font-weight:700">MALE</span>,products) <span style="color:#43a8ed;font-weight:700">if</span> age<span style="color:#43a8ed;font-weight:700">></span><span style="color:#44aa43">18</span> <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> products
<span style="color:#43a8ed;font-weight:700">case</span> _ <span style="color:#43a8ed;font-weight:700">=</span><span style="color:#43a8ed;font-weight:700">></span> <span style="color:#43a8ed;font-weight:700">Seq</span>[<span style="color:#43a8ed;font-weight:700">Product</span>]()
}
val productDisplayer<span style="color:#43a8ed;font-weight:700">=</span>getProducts andThen displayProducts(adultPolicy)
<span style="color:#43a8ed;font-weight:700">def</span> complexMethod(<span style="color:#c5656b;font-weight:700">u</span>: <span style="color:#43a8ed;font-weight:700">User</span>): <span style="color:#43a8ed;font-weight:700">String</span> <span style="color:#43a8ed;font-weight:700">=</span> introduce(u) <span style="color:#43a8ed;font-weight:700">+</span> productDisplayer(u)
}
</pre>
<p>
Is it complex or simple code? It's difficult to say, because <b>Cyclomatic Complexity</b> tends not to be very helpful when lambdas enter the scene. A couple of paragraphs below we will think about a possible solution.
</p>
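<p>
A small Java sketch may make the measuring problem more concrete. Both methods below (the names <i>longNamesLoop</i> and <i>longNamesStream</i> are made up for this illustration) return the same result, but a metric which only counts branching keywords such as <i>for</i> and <i>if</i> sees two decision points in the first one and none in the second. The exact numbers reported depend on the tool and its version, so treat the comments as an assumption about a keyword-based counter.
</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class HiddenBranchesDemo {
    // Imperative version: one "for" and one "if" are visible to a
    // keyword-based counter, so it would report CC = 3 here.
    static List<String> longNamesLoop(List<String> names, int minLength) {
        List<String> result = new ArrayList<>();
        for (String n : names) {
            if (n.length() >= minLength) {
                result.add(n);
            }
        }
        return result;
    }

    // Functional version: the same iteration and the same decision are
    // hidden inside stream() and filter(), so the same counter sees CC = 1.
    static List<String> longNamesStream(List<String> names, int minLength) {
        return names.stream()
            .filter(n -> n.length() >= minLength)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Al", "Alice", "Bo", "Robert");
        System.out.println(longNamesLoop(names, 3));
        System.out.println(longNamesStream(names, 3));
    }
}
```

<p>
The decision did not disappear in the second version, it just moved into the predicate passed to <i>filter</i>, which is exactly why a procedural metric says little about this style of code.
</p>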
<h1>Non Linear Nature of Complexity</h1>
<p class="akapit">
OK, after all those examples and exercises let's try to understand the non-linear nature of complexity. We saw that when we move logic to external components or functions, we can make them context-unaware to some degree. They then receive information about the context they are used in through various kinds of parametrization.
</p>
<p class="akapit">
So if, for example, I have a <i>Component A with CC=2</i> and I want to use this component in some <i>Component B</i>, then this action does not raise the CC of <i>Component B</i>. Of course I assume that <i>Component A</i> was properly tested and behaves in a predictable way.
</p>
<p class="akapit">
Now if I took the logic out of <i>Component A</i> and pasted it directly into <i>Component B</i>, then <b>most likely</b> I could not use this piece of code in <i>Component C</i>, because
it is already embedded in <i>Context B</i>. I would have to paste it into <i>Component C</i> as well, and this way I would <b>raise CC by 4</b> with the same piece of code: 2 in each of the two components.
</p>
<p class="akapit">
If it is not clear yet, let's return to this part:
<pre>
if(age>18){
if(MALE){...} //1
else{...}
}else{
if(MALE){...} //2
else{...}
}
</pre>
Both ifs //1 and //2 are similar, but each is aware of its context: the first one lives in the context of age>18 and the second one in age<=18. So every occurrence of similar code will raise CC by 2.
</p>
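<p>
As a rough sketch of the alternative, the duplicated gender check can be pulled out into a function that does not know which age branch it is called from. All names here (<i>ContextFreeLabelDemo</i>, <i>labelNested</i>, <i>labelComposed</i> and the helper methods) are hypothetical and exist only for this illustration.
</p>

```java
final class ContextFreeLabelDemo {
    enum Gender { MALE, FEMALE }

    // Context-aware version: the gender check is written twice,
    // once inside each age branch.
    static String labelNested(int age, Gender gender) {
        if (age > 18) {
            if (gender == Gender.MALE) {
                return "Adult Male";
            } else {
                return "Adult Female";
            }
        } else {
            if (gender == Gender.MALE) {
                return "Child Male";
            } else {
                return "Child Female";
            }
        }
    }

    // Context-unaware pieces: each decision is written exactly once.
    static String ageLabel(int age) {
        return age > 18 ? "Adult" : "Child";
    }

    static String genderLabel(Gender gender) {
        return gender == Gender.MALE ? "Male" : "Female";
    }

    // The composed version only glues the two independent answers together.
    static String labelComposed(int age, Gender gender) {
        return ageLabel(age) + " " + genderLabel(gender);
    }

    public static void main(String[] args) {
        System.out.println(labelNested(30, Gender.FEMALE));
        System.out.println(labelComposed(30, Gender.FEMALE));
    }
}
```

<p>
Adding a third age group to the nested version means writing the gender check a third time, while in the composed version only <i>ageLabel</i> changes.
</p>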
<h2>OOP & FP </h2>
<p class="akapit">
I hope I also managed to show that proper usage of both OOP and FP mechanisms reduces complexity. <b>According to my observations, programmers generate complexity mainly not by using the wrong paradigm but by using any paradigm in the wrong way</b> - in short, first learn OOP or FP properly and only then discuss FP vs OOP (and of course I should do this too).
</p>
<h1>Psychological Aspect</h1>
<p class="akapit">
Until now we have discussed how code complexity influences the code itself, but can we actually check how this complexity affects developers?
</p>
<p class="akapit">
Actually, there was an interesting experiment more than half a century ago - an experiment which described the boundaries of human cognitive capacity.
Its description can be found here: <a href="https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus_or_Minus_Two">The magic number seven plus/minus two</a>. According to the researcher, George A. Miller, <b>a human being can process 7 +/- 2 chunks</b> of information at once. You can find more info on Wikipedia to better understand what exactly a <b>chunk of information</b> is.
</p>
<p>
The answer to the question "why do developers have to write code with bugs?" - because developers are (still) humans ;)
</p>
<h1>Summary</h1>
<p class="akapit">
I hope that this "complexity adventure" will be helpful to those who are curious how their code obtains its "arrow shape" and where complexity is lurking. We also checked how the usage of FP influences Cyclomatic Complexity and saw the limitations of procedural metrics in the FP world. This opens up an interesting question about how to measure the complexity of functional code. Very often functional constructs build a solution without the <i>ifs</i> and <i>fors</i> which are the basis for CC calculation.
</p>
<p>
What can we do? <br/>
<ul>
<li>build more intuition around Scalastyle rules</li>
<li>research papers on complexity measurements for functional programming</li>
</ul>
</p>
<p>
Some materials for further investigation:
<ul>
<li><a href="http://doc.utwente.nl/64090/1/Berg_1992.pdf">http://doc.utwente.nl/64090/1/Berg_1992.pdf</a> </li>
<li> <a href="https://books.google.pl/books?id=p0yV1sHLubcC&pg=PA33&lpg=PA33&dq=functional+programming+complexity+metrics&source=bl&ots=x62oAXZJkn&sig=nHYNO_PsA6qdTaum78hcAtxGza0&hl=pl&sa=X&ved=0ahUKEwjT27_UgtfJAhVipnIKHe0eB_QQ6AEIVTAH#v=onepage&q=functional%20programming%20complexity%20metrics&f=false">Measuring Haskell</a> </li>
<li> <a href="http://core.ac.uk/download/files/57/92138.pdf">SOFTWARE MEASUREMENT FOR FUNCTIONAL PROGRAMMING</a> </li>
<li> <a href="http://doc.utwente.nl/57883/1/thesis_K_van_den__Berg.pdf">Software Measurement and Functional Programming </a> </li>
</ul>
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIm1sIxqJzLtqqIrScXK4hUdQQvR_Y45SzfBe09RmO-LQ36B-aFPlQU06Q9EOEcEH9XHBenJE1j8sgIC-to6hnZxaDFWBME-RqCGooe-B9ShR_mQfSCbhavMPLftUnb-jAjwoCdLC1L8Zt/s1600/SDC11480.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiIm1sIxqJzLtqqIrScXK4hUdQQvR_Y45SzfBe09RmO-LQ36B-aFPlQU06Q9EOEcEH9XHBenJE1j8sgIC-to6hnZxaDFWBME-RqCGooe-B9ShR_mQfSCbhavMPLftUnb-jAjwoCdLC1L8Zt/s640/SDC11480.JPG" /></a></div>
<h1>Understanding composition</h1>
<p>Paweł Włodarski, published 2015-11-09</p>
<p class="akapit">
<i>"Composition over inheritance"</i> - I think I heard this for the first time around 2006, when programming in Java 5 with generics was still a luxury and everything was literally <b>"an Object"</b>. I remember an enormous shock after reading "Head First Design Patterns", when my mind started to understand the difference between <i>OOP</i> and <i>procedural programming</i>.
This was a shock because as a Junior I had to work with large classes (with over 1000 lines per method) called <i>SomethingManager</i> or <i>SomethingHelper</i> - and then "the book" stated "well... this is wrong... it's not Object Oriented Programming".</p>
<p class="akapit">
A decade later we can hear that another approach, called Functional Programming, promises better composition than anything you have tried before. Normally we would write a couple of "hello worlds" to gain better intuition, gradually gather new knowledge and make some judgement.
</p>
<p class="akapit">
Yet... with FP something magical is hidden underneath - something marginalized yet very powerful,
something scary yet very helpful and finally something which is denied but exists everywhere.
</p>
<p class="akapit">
But let's start with something simple...
</p>
<h1>What does it mean "to compose"?</h1>
<p class="akapit">
Literally, "composition" can be understood as the use of N simple elements to create a more complex construction. For example, the tap below is a composition of some elements. It technically
works, it passed the acceptance criteria, and it is a good metaphor for many IT projects I have seen in my life.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEielm_81rYp1UMjM279iIiZeSYrkWGnKvcX1k7V6ILR9IAlcTIFHjRu78iPsKJzdzkLqSKyf9luoXgrGzo5FC66tt9fWCXflhnn5H8v7fd6BECvjYz7F2cuivD2cc5aj-Xh_QQTB2tHBDRl/s1600/kran.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEielm_81rYp1UMjM279iIiZeSYrkWGnKvcX1k7V6ILR9IAlcTIFHjRu78iPsKJzdzkLqSKyf9luoXgrGzo5FC66tt9fWCXflhnn5H8v7fd6BECvjYz7F2cuivD2cc5aj-Xh_QQTB2tHBDRl/s400/kran.jpg" /></a></div>
<p class="akapit">
Of course we have an intuition that although this mechanism works, something is wrong with it. So let's think a little bit about composition in general - can we say when things are easy to compose and how FP can help us with it? Finally, let's write some code. Suppose I have a function in <b>Java</b>:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">Integer</span>></span> f1 <span style="color:#794938">=</span> i <span style="color:#794938">-</span><span style="color:#794938">></span> i <span style="color:#794938">+</span> <span style="color:#811f24;font-weight:700">1</span>;
</pre>
<p>
It takes an <i>Integer</i> as input and returns an <i>Integer</i> as output, so we can combine it with any function that ends with an <i>Integer</i> or starts with an <i>Integer</i>, like this:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">String</span>></span> f2 <span style="color:#794938">=</span> i <span style="color:#794938">-</span><span style="color:#794938">></span> <span style="color:#0b6125">"composition is beautiful : "</span> <span style="color:#794938">+</span> i;
<span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">String</span>></span> composed <span style="color:#794938">=</span> f1<span style="color:#794938">.</span>andThen(f2);
</pre>
<p>
Or we could literally call this operation a <b>composition</b>
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">String</span>></span> alsoComposed <span style="color:#794938">=</span> f2<span style="color:#794938">.</span>compose(f1);
<span style="color:#a71d5d;font-style:italic">System</span><span style="color:#794938">.</span>out<span style="color:#794938">.</span>println(composed<span style="color:#794938">.</span>apply(<span style="color:#811f24;font-weight:700">1</span>)); <span style="color:#5a525f;font-style:italic">//composition is beautiful : 2</span>
<span style="color:#a71d5d;font-style:italic">System</span><span style="color:#794938">.</span>out<span style="color:#794938">.</span>println(alsoComposed<span style="color:#794938">.</span>apply(<span style="color:#811f24;font-weight:700">1</span>)); <span style="color:#5a525f;font-style:italic">//composition is beautiful : 2</span>
</pre>
<p>
Before Java 8, when there was no concept of a function as a value, we would have to treat them like standard methods and chain the invocations by hand:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">Integer</span> result1 <span style="color:#794938">=</span> f1<span style="color:#794938">.</span>apply(<span style="color:#811f24;font-weight:700">1</span>);
<span style="color:#a71d5d;font-style:italic">String</span> finalResult <span style="color:#794938">=</span> f2<span style="color:#794938">.</span>apply(result1);
</pre>
<p>
It is still a form of composition, but we can feel that more work is needed here. That was Java - now let's do the same in <b>Scala</b>, which is more expressive.
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">f1</span>:<span style="color:#a71d5d;font-style:italic">Int</span>=><span style="color:#a71d5d;font-style:italic">Int</span>= i=>i+<span style="color:#811f24;font-weight:700">1</span>
<span style="color:#794938">val</span> <span style="color:#bf4f24">f2</span>:<span style="color:#a71d5d;font-style:italic">Int</span>=><span style="color:#a71d5d;font-style:italic">String</span> = i=> <span style="color:#0b6125">"composition is beautiful : "</span> + i
<span style="color:#794938">val</span> <span style="color:#bf4f24">composed</span>=f2 compose f1
composed(<span style="color:#811f24;font-weight:700">1</span>) <span style="color:#5a525f;font-style:italic">// res0: String = composition is beautiful : 2</span>
</pre>
<p>
It looks nice with simple numbers, but to gain better intuition let's simulate a more real-life example.
</p>
<h2>More real life example </h2>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">type</span> <span style="color:#bf4f24">Money</span>=<span style="color:#a71d5d;font-style:italic">Double</span> <span style="color:#5a525f;font-style:italic">//yes it's very bad and evil</span>
<span style="color:#794938">type</span> <span style="color:#bf4f24">DataBase</span> = <span style="color:#bf4f24">Map</span>[<span style="color:#a71d5d;font-style:italic">Int</span>,<span style="color:#bf4f24">User</span>]
<span style="color:#794938">case</span> <span style="color:#794938">class</span> <span style="color:#bf4f24">User</span>(<span style="color:#794938">val</span> <span style="color:#bf4f24">name</span>:<span style="color:#a71d5d;font-style:italic">String</span>,<span style="color:#794938">val</span> <span style="color:#bf4f24">salaryNet</span>:<span style="color:#bf4f24">Money</span>)
</pre>
<p>
This will be our model, with a very unprofessional use of Double as Money. Let's create a mock database with some records of keys and users:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">database</span>:<span style="color:#bf4f24">DataBase</span>=<span style="color:#bf4f24">Map</span>(
<span style="color:#811f24;font-weight:700">1</span>-><span style="color:#bf4f24">User</span>(<span style="color:#0b6125">"Stefan"</span>,<span style="color:#811f24;font-weight:700">10.0</span>),
<span style="color:#811f24;font-weight:700">2</span>-><span style="color:#bf4f24">User</span>(<span style="color:#0b6125">"Zdzislawa"</span>,<span style="color:#811f24;font-weight:700">15.0</span>),
<span style="color:#811f24;font-weight:700">3</span>-><span style="color:#bf4f24">User</span>(<span style="color:#0b6125">"Boromir"</span>,<span style="color:#811f24;font-weight:700">20.0</span>)
)
</pre>
<p>
And finally the actors of this spectacle - three simple functions which we are going to compose together:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">lookup</span>:<span style="color:#bf4f24">DataBase</span>=><span style="color:#a71d5d;font-style:italic">Int</span>=><span style="color:#bf4f24">User</span> = db=>key => db(key)
<span style="color:#794938">val</span> <span style="color:#bf4f24">salary</span>:<span style="color:#bf4f24">User</span>=><span style="color:#bf4f24">Money</span>= _.salaryNet
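<span style="color:#794938">val</span> <span style="color:#bf4f24">net</span>:<span style="color:#bf4f24">Money</span>=><span style="color:#bf4f24">Money</span>= _ * <span style="color:#811f24;font-weight:700">1.23</span> <span style="color:#5a525f;font-style:italic">//despite its name this adds 23% to the net salary, i.e. it returns the gross value</span>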
</pre>
<h1>Perfect World Style</h1>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#5a525f;font-style:italic">//perfect world style</span>
<span style="color:#794938">val</span> <span style="color:#bf4f24">composedLogic</span>: (<span style="color:#a71d5d;font-style:italic">Int</span>) => <span style="color:#bf4f24">Money</span> =net compose salary compose lookup(database)
composedLogic(<span style="color:#811f24;font-weight:700">1</span>)
composedLogic(<span style="color:#811f24;font-weight:700">2</span>)
composedLogic(<span style="color:#811f24;font-weight:700">3</span>)
<span style="color:#5a525f;font-style:italic">//res1: Money = 12.3</span>
<span style="color:#5a525f;font-style:italic">//res2: Money = 18.45</span>
<span style="color:#5a525f;font-style:italic">//res3: Money = 24.6</span>
</pre>
<p>
Isn't it beautiful and simple? We took all the parts and now we have a fully functional program... but do we really? What if reality hits us with impurity?
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#5a525f;font-style:italic">//problem</span>
<span style="color:#5a525f;font-style:italic">//composedLogic(4)</span>
<span style="color:#5a525f;font-style:italic">/// java.util.NoSuchElementException: key not found: 4</span>
</pre>
<p>
Most likely we have all had this problem at some point in our professional life - it's time to use the infamous "If" solution.
</p>
<h1>When composition is difficult - the "dark ages" approach</h1>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQqT2OtkOsWtdp-4VLBwmq68BlE-YIv9rwuo36611SO-V8bDCphOMqAeAjH4bUC6-RH86OFQfSb5WmP6pihGSoASxjCbaxQ3KBWqlTkuA6soqOKiGOQaLkuxQgN93gbUTJekwwvrCr2L3w/s1600/dark.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQqT2OtkOsWtdp-4VLBwmq68BlE-YIv9rwuo36611SO-V8bDCphOMqAeAjH4bUC6-RH86OFQfSb5WmP6pihGSoASxjCbaxQ3KBWqlTkuA6soqOKiGOQaLkuxQgN93gbUTJekwwvrCr2L3w/s400/dark.jpg" /></a></div>
<p>
Let's handle the missing case in a classical DAO style:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">lookupDarkAges</span>:<span style="color:#bf4f24">DataBase</span>=><span style="color:#a71d5d;font-style:italic">Int</span>=><span style="color:#bf4f24">User</span> = db=>key =>
<span style="color:#794938">if</span>(db.contains(key)) db(key) <span style="color:#794938">else</span> <span style="color:#811f24;font-weight:700">null</span>
</pre>
<p>
For now it looks fine. Let's do the composition:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">composedLogicDark</span>: (<span style="color:#a71d5d;font-style:italic">Int</span>) => <span style="color:#bf4f24">Money</span> =net compose salary compose lookupDarkAges(database)
</pre>
<p>
It compiles... can it be IT? Well no, of course not ->
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#5a525f;font-style:italic">//composedLogicDark(4)</span>
<span style="color:#5a525f;font-style:italic">//composedLogicDark: Int => Money = <function1></span>
<span style="color:#5a525f;font-style:italic">//AND the Stack Trace is SWEET</span>
<span style="color:#5a525f;font-style:italic">//java.lang.NullPointerException</span>
<span style="color:#5a525f;font-style:italic">//at pl.pawelwlodarski.fp.math.one.A$A109$A$A109$$anonfun$salary$1.apply(CompositionScala.sc0.tmp:19)</span>
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7CsvOnFEoTxMISIZkNyxph0OmIPuwHlDu5ajAiZs1KBC1HWJqw_5ERWHhX174EwI-tY4h8khaZouCNNqoSJUdgMGc6_FMKOyn60ciNbj751SsTlzIMKsZPfxRkuXUsJf5A7Tx3XR0-MH7/s1600/badluck.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg7CsvOnFEoTxMISIZkNyxph0OmIPuwHlDu5ajAiZs1KBC1HWJqw_5ERWHhX174EwI-tY4h8khaZouCNNqoSJUdgMGc6_FMKOyn60ciNbj751SsTlzIMKsZPfxRkuXUsJf5A7Tx3XR0-MH7/s400/badluck.jpg" /></a></div>
<p>
Now let's understand why this doesn't compose. To safely use the lookup function in Dark Ages style, we need to adapt every function that follows it by manually checking for null, and in real life this is how <b>arrow code</b> is created:
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">lookupResult</span> = lookupDarkAges(database)(<span style="color:#811f24;font-weight:700">4</span>)
<span style="color:#794938">if</span>(lookupResult!=<span style="color:#811f24;font-weight:700">null</span>){
<span style="color:#794938">val</span> <span style="color:#bf4f24">salaryResult</span>=salary(lookupResult)
<span style="color:#794938">if</span>(salaryResult!=<span style="color:#811f24;font-weight:700">null</span>){
net(salaryResult)
}
}
</pre>
<p>But now, with the full power of FP, we can use another approach </p>
<h1>the Scary "M" approach - "ehhh another Option example..."</h1>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#5a525f;font-style:italic">//scary "M" style</span>
<span style="color:#794938">val</span> <span style="color:#bf4f24">lookupScaryM</span>:<span style="color:#bf4f24">DataBase</span>=><span style="color:#a71d5d;font-style:italic">Int</span>=><span style="color:#bf4f24">Option</span>[<span style="color:#bf4f24">User</span>] = db=>key =>
<span style="color:#794938">if</span>(db.contains(key)) <span style="color:#bf4f24">Some</span>(db(key)) <span style="color:#794938">else</span> <span style="color:#811f24;font-weight:700">None</span>
<span style="color:#5a525f;font-style:italic">//or just db.get(key) but let's keep it similar to previous examples</span>
</pre>
<p> There are thousands of code examples on the internet about the Option construct, but let's explain it for people who may not know it. Our method now returns a wrapper whose content can be transformed by a special method (<i>map</i>) which receives a lambda/function as an argument. That's why we can call it a higher-order function, and since version 8 it can also be used in Java.</p>
<p>
It allows us to use functional composition in the context of real-world problems like missing values, and it protects us from null checks and arrow code.
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">stillPerfectComposition</span>: (<span style="color:#bf4f24">User</span>) => <span style="color:#bf4f24">Money</span> =net compose salary
<span style="color:#794938">val</span> <span style="color:#bf4f24">composedScaryM</span> :<span style="color:#a71d5d;font-style:italic">Int</span> => <span style="color:#bf4f24">Option</span>[<span style="color:#bf4f24">Money</span>]=
key => lookupScaryM(database)(key) map stillPerfectComposition
composedScaryM(<span style="color:#811f24;font-weight:700">1</span>)
<span style="color:#5a525f;font-style:italic">//res4: Option[Money] = Some(12.3)</span>
composedScaryM(<span style="color:#811f24;font-weight:700">4</span>)
<span style="color:#5a525f;font-style:italic">//res5: Option[Money] = None</span>
</pre>
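<p>To make the "wrapper transformed by a function" idea more concrete, here is a minimal Java sketch of how such a wrapper could be written by hand. The names <i>Maybe</i>, <i>Some</i> and <i>None</i> are hypothetical and exist only for this illustration. This is not how Scala's Option or Java's Optional is actually implemented, just the smallest shape that shows why <i>map</i> skips the function when the value is missing.</p>

```java
import java.util.function.Function;

// Hypothetical hand-rolled Option, for illustration only (not java.util.Optional).
public class MaybeSketch {

    // A value that is either present (Some) or missing (None).
    static abstract class Maybe<A> {
        // map applies f only when a value is present; otherwise the result stays empty
        abstract <B> Maybe<B> map(Function<? super A, ? extends B> f);
    }

    static final class Some<A> extends Maybe<A> {
        final A value;

        Some(A value) { this.value = value; }

        @Override
        <B> Maybe<B> map(Function<? super A, ? extends B> f) {
            return new Some<B>(f.apply(value));
        }

        @Override
        public String toString() { return "Some(" + value + ")"; }
    }

    static final class None<A> extends Maybe<A> {
        @Override
        <B> Maybe<B> map(Function<? super A, ? extends B> f) {
            // nothing to transform, so f is never called
            return new None<B>();
        }

        @Override
        public String toString() { return "None"; }
    }

    public static void main(String[] args) {
        Function<Integer, Integer> addOne = i -> i + 1;
        Maybe<Integer> present = new Some<Integer>(1);
        Maybe<Integer> missing = new None<Integer>();
        System.out.println(present.map(addOne)); // Some(2)
        System.out.println(missing.map(addOne)); // None
    }
}
```

<p>The null check has not disappeared; it has moved into one place (the two implementations of <i>map</i>), so the functions we compose never have to repeat it.</p>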
<p>And now we can contact the real world somewhere over there </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#5a525f;font-style:italic">//Leave mathematical world</span>
<span style="color:#794938">case</span> <span style="color:#794938">class</span> <span style="color:#bf4f24">HttpMessage</span>(code:<span style="color:#a71d5d;font-style:italic">Int</span>,body:<span style="color:#a71d5d;font-style:italic">String</span>)
<span style="color:#794938">val</span> <span style="color:#bf4f24">error</span>=<span style="color:#bf4f24">HttpMessage</span>(<span style="color:#811f24;font-weight:700">404</span>,<span style="color:#0b6125">"<div>I'm so sorry for this error</div>"</span>)
<span style="color:#794938">val</span> <span style="color:#bf4f24">success</span>=(salary:<span style="color:#bf4f24">Money</span>)=><span style="color:#bf4f24">HttpMessage</span>(<span style="color:#811f24;font-weight:700">200</span>,s<span style="color:#0b6125">"<div> salary : $salary</div>"</span>)
<span style="color:#794938">val</span> <span style="color:#bf4f24">result</span> = composedScaryM(<span style="color:#811f24;font-weight:700">1</span>) <span style="color:#5a525f;font-style:italic">//Some(12.3) from the previous example</span>
<span style="color:#794938">val</span> <span style="color:#bf4f24">httpMessage</span> = result.fold(error)(success)
<span style="color:#5a525f;font-style:italic">// HttpMessage(200,<div> salary : 12.3</div>)</span>
</pre>
<h1>Java example</h1>
<p> For Java fans </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#a71d5d;font-style:italic">Map<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">User</span>></span> database <span style="color:#794938">=</span> <span style="color:#794938">new</span> <span style="color:#a71d5d;font-style:italic">HashMap<></span>();
database<span style="color:#794938">.</span>put(<span style="color:#811f24;font-weight:700">1</span>, <span style="color:#794938">new</span> <span style="color:#a71d5d;font-style:italic">User</span>(<span style="color:#0b6125">"Stefan"</span>, <span style="color:#811f24;font-weight:700">10.0</span>));
database<span style="color:#794938">.</span>put(<span style="color:#811f24;font-weight:700">2</span>, <span style="color:#794938">new</span> <span style="color:#a71d5d;font-style:italic">User</span>(<span style="color:#0b6125">"Zdzislawa"</span>, <span style="color:#811f24;font-weight:700">15.0</span>));
<span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Map<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">User</span>></span>, <span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Integer</span>, <span style="color:#a71d5d;font-style:italic">Optional<<span style="color:#a71d5d;font-style:italic">User</span>></span>></span>></span> lookup <span style="color:#794938">=</span> db <span style="color:#794938">-</span><span style="color:#794938">></span> key <span style="color:#794938">-</span><span style="color:#794938">></span> {
<span style="color:#a71d5d;font-style:italic">User</span> u <span style="color:#794938">=</span> db<span style="color:#794938">.</span>get(key);
<span style="color:#794938">return</span> <span style="color:#a71d5d;font-style:italic">Optional</span><span style="color:#794938">.</span>ofNullable(u);
};
<span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">User</span>, <span style="color:#a71d5d;font-style:italic">Double</span>></span> salary <span style="color:#794938">=</span> <span style="color:#a71d5d;font-style:italic">User</span><span style="color:#794938">:</span><span style="color:#794938">:</span>getSalaryNet;
<span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Double</span>, <span style="color:#a71d5d;font-style:italic">Double</span>></span> gross <span style="color:#794938">=</span> money <span style="color:#794938">-</span><span style="color:#794938">></span> money <span style="color:#794938">*</span> <span style="color:#811f24;font-weight:700">1.23</span>;
<span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">User</span>, <span style="color:#a71d5d;font-style:italic">Double</span>></span> perfectComposition <span style="color:#794938">=</span> gross<span style="color:#794938">.</span>compose(salary);
<span style="color:#a71d5d;font-style:italic">Function<<span style="color:#a71d5d;font-style:italic">Integer</span>,<span style="color:#a71d5d;font-style:italic">Optional<<span style="color:#a71d5d;font-style:italic">Double</span>></span>></span> program <span style="color:#794938">=</span>key <span style="color:#794938">-</span><span style="color:#794938">></span> lookup<span style="color:#794938">.</span>apply(database)<span style="color:#794938">.</span>apply(key)<span style="color:#794938">.</span>map(perfectComposition);
<span style="color:#a71d5d;font-style:italic">System</span><span style="color:#794938">.</span>out<span style="color:#794938">.</span>println(program<span style="color:#794938">.</span>apply(<span style="color:#811f24;font-weight:700">1</span>));
<span style="color:#a71d5d;font-style:italic">System</span><span style="color:#794938">.</span>out<span style="color:#794938">.</span>println(program<span style="color:#794938">.</span>apply(<span style="color:#811f24;font-weight:700">4</span>));
<span style="color:#5a525f;font-style:italic">//Optional[12.3]</span>
<span style="color:#5a525f;font-style:italic">//Optional.empty</span>
</pre>
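<p>The Java version above stops at <i>Optional</i>. As far as I know <i>java.util.Optional</i> has no <i>fold</i> method, so the step where we leave the mathematical world can be sketched with <i>map</i> followed by <i>orElse</i>. The <i>HttpMessage</i> class and the <i>toHttp</i> helper below are hypothetical, written only to mirror the Scala example.</p>

```java
import java.util.Optional;
import java.util.function.Function;

public class LeaveOptionalSketch {

    // Hypothetical Java counterpart of the Scala HttpMessage case class
    static final class HttpMessage {
        final int code;
        final String body;

        HttpMessage(int code, String body) {
            this.code = code;
            this.body = body;
        }

        @Override
        public String toString() { return "HttpMessage(" + code + "," + body + ")"; }
    }

    static final HttpMessage ERROR =
            new HttpMessage(404, "<div>I'm so sorry for this error</div>");

    static final Function<Double, HttpMessage> SUCCESS =
            salary -> new HttpMessage(200, "<div> salary : " + salary + "</div>");

    // map handles the present case, orElse supplies the fallback for the empty one
    static HttpMessage toHttp(Optional<Double> result) {
        return result.map(SUCCESS).orElse(ERROR);
    }

    public static void main(String[] args) {
        System.out.println(toHttp(Optional.of(12.3)));
        System.out.println(toHttp(Optional.<Double>empty()));
    }
}
```

<p>Both branches produce the same type, so the caller receives a plain <i>HttpMessage</i> and never has to inspect the <i>Optional</i> itself.</p>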
<p>OK, but I promised something powerful. <i>Option</i> is very useful but it is only a part of something bigger, a lot bigger... </p>
<h1>The "F" word and another scary "M" word</h1>
<p>
There is something scary hidden under <a href="https://en.wikipedia.org/wiki/Functor">THIS MYSTERIOUS LINK</a>. Just a moment ago I stated that <i>Option</i> is just a wrapper - I lied (to some extent). <b>"A wrapper"</b> is an old way of thinking - we can look at it from a very different perspective and see that we actually used the power of the <i>Type System</i> to signal the possibility of a missing value.
</p>
<p>If you followed the mysterious link, you saw a page which maybe looks scary but for sure is not "cool". It leads to a wiki page about <b>Functor</b> and opens the gates to another scary concept denied by most programmers, something called... Mathematics </p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisKoK4VlfCk94pgLNNE6IeC9wReDsEOLjWDi2hA8ttL4QGNVQLa6YySPQ_a3XtyOW21CDicXGPbwGXeSVjm14Ofstf1lnQPN1rB2xanXBCuRKDZkd-OJtPwfwudUXfKwsscml70iWbd7Nd/s1600/functor.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEisKoK4VlfCk94pgLNNE6IeC9wReDsEOLjWDi2hA8ttL4QGNVQLa6YySPQ_a3XtyOW21CDicXGPbwGXeSVjm14Ofstf1lnQPN1rB2xanXBCuRKDZkd-OJtPwfwudUXfKwsscml70iWbd7Nd/s640/functor.jpg" /></a></div>
<p>
It may look scary, but you have already used it - yes, by using <i>Option</i>. My interpretation may not be accepted by a mathematician, but generally we can see there that a Functor changes something like <b>X => Y</b> into <b>F(X) => F(Y)</b>. In our example it changed <b>USER=>MONEY</b> into <b>OPTION(USER) => OPTION(MONEY)</b>
</p>
<p>
And this is what the <i>map</i> method on Option does. Imagine it as a standalone function: <b>map(A=>B): Option[A] => Option[B]</b>. Ok, great, but does it have any practical value? You know, in every Scala presentation everyone carefully states "you don't need to know category theory". But when you write a program, how do you know you are doing it "correctly"?
</p>
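<p>The "standalone function" view of <i>map</i> can be sketched in Java as well. The helper name <i>lift</i> is an assumption made for this illustration; it is not part of the JDK. It takes a plain function <b>A => B</b> and returns a function <b>Optional(A) => Optional(B)</b>, which is exactly the change described above.</p>

```java
import java.util.Optional;
import java.util.function.Function;

public class LiftSketch {

    // Hypothetical helper: turns A -> B into Optional<A> -> Optional<B>
    static <A, B> Function<Optional<A>, Optional<B>> lift(Function<A, B> f) {
        return optional -> optional.map(f);
    }

    public static void main(String[] args) {
        Function<Integer, String> describe = i -> "composition is beautiful : " + i;

        // the original function knows nothing about missing values...
        Function<Optional<Integer>, Optional<String>> lifted = lift(describe);

        // ...but its lifted version works on values that may be absent
        System.out.println(lifted.apply(Optional.of(2)));
        System.out.println(lifted.apply(Optional.<Integer>empty()));
    }
}
```

<p>Seen this way, <i>map</i> does not "unwrap" anything. It moves an ordinary function into the world of optional values.</p>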
<h1>"CUSTOM" Driven Development</h1>
<p class="akapit">
Once upon a time a guy lived in Scotland and his philosophy became one of the most important thought movements in history - this guy was <a href="https://pl.wikipedia.org/wiki/David_Hume">David Hume</a> and according to him people in general organize their lives around a <b>set of Customs</b>. "I don't have proof that tomorrow the sun will rise but it happened 10000 times before so what the hell - I will assume that tomorrow the situation will repeat".
</p>
<p class="akapit">
I'm wondering if we are not doing the same thing with <i>Computer Science</i>, which in most areas is not a science anymore but has become a "set of principles created by few and adopted by many".
Look - OOP is not a science - even Alan Kay, who coined the term, doesn't have a strict definition ---> <a href="http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en">http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/doc_kay_oop_en</a>
</p>
<p>
Most discussions about OOP which my inexperienced eyes saw and my novice ears heard followed the schema: <i>My Opinion vs. your Opinion</i>. This may be convenient for drinking beer but is not so useful for an engineering science. At the same time the mentioned <b>Functor</b> is part of a bigger theory and it has some <b>formal laws</b> - yes, formal laws. Moreover "Scalaz" - a functional library for Scala - ships with prepared automated ScalaCheck tests to check whether the Functor implementation from your domain is correct according to those formal laws!
</p>
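<p>To give a feeling for what such laws look like, here is a small Java sketch that spot-checks the two functor laws on <i>Optional</i>: the identity law (mapping the identity function changes nothing) and the composition law (mapping <i>f</i> and then <i>g</i> gives the same result as mapping their composition once). This is a hand-written check on a few sample values, not a property-based test like the ScalaCheck ones, and all names in it are assumptions made for the illustration. One caveat: Java's <i>Optional.map</i> turns a null result into an empty <i>Optional</i>, so the composition law can fail for functions that return null. The sketch only uses functions that never do.</p>

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class FunctorLawsSketch {

    // Identity law: mapping the identity function changes nothing
    static <A> boolean identityLaw(Optional<A> fa) {
        return fa.map(Function.<A>identity()).equals(fa);
    }

    // Composition law: mapping f and then g equals mapping "f andThen g" once
    static <A, B, C> boolean compositionLaw(Optional<A> fa, Function<A, B> f, Function<B, C> g) {
        return fa.map(f).map(g).equals(fa.map(f.andThen(g)));
    }

    public static void main(String[] args) {
        Function<Integer, Integer> addOne = i -> i + 1;
        Function<Integer, String> describe = i -> "value : " + i;

        // a few sample inputs, including the empty case
        List<Optional<Integer>> samples =
                Arrays.asList(Optional.of(1), Optional.of(42), Optional.<Integer>empty());

        for (Optional<Integer> sample : samples) {
            System.out.println(sample
                    + " identity=" + identityLaw(sample)
                    + " composition=" + compositionLaw(sample, addOne, describe));
        }
    }
}
```

<p>The point is not this particular check but the fact that "is my implementation correct?" gets an answer that can be tested instead of debated.</p>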
<p>
This is a really new approach. After a decade of "Opinion based development" this is something really new for me. It seems very promising, but we are living in strange times when "being a programmer" is a very blurry concept. For some you are an artist and for some you are just a fucking resource.
</p>
<p>But let's return to something more pleasant </p>
<h1> Functional purity - Haskell Kindergarten </h1>
<p>
This is as far as I can get with Haskell today, but using a pure functional language where you cannot cheat is a very interesting exercise. As a warm-up let's do a simple composition.
In Haskell you compose with a <b>"dot"</b> - <b>-->.<--</b> (by "dot"? This is stupid! - In Java you call a method with a "dot"... - Ok... continue)
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> <span style="color:#794938">let</span> f1 i <span style="color:#794938">=</span>i<span style="color:#794938">+</span><span style="color:#811f24;font-weight:700">1</span>
<span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> <span style="color:#794938">let</span> f2 i<span style="color:#794938">=</span> <span style="color:#0b6125">"composition is beautiful : "</span> <span style="color:#794938">++</span> <span style="color:#693a17">show</span> i
<span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> <span style="color:#794938">let</span> f3<span style="color:#794938">=</span> f2 <span style="color:#794938">.</span> f1
<span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> f3 <span style="color:#811f24;font-weight:700">1</span>
<span style="color:#0b6125">"composition is beautiful : 2"</span>
</pre>
<p>You can declare type aliases similarly to the way we do it in Scala (or rather the other way around - I believe Haskell was first) </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> <span style="color:#794938">type</span> <span style="color:#811f24;font-weight:700">Money</span><span style="color:#794938">=</span><span style="color:#811f24;font-weight:700">Double</span>
<span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> <span style="color:#794938">data</span> <span style="color:#811f24;font-weight:700">User</span> <span style="color:#794938">=</span> <span style="color:#811f24;font-weight:700">User</span> {name <span style="color:#794938">::</span> <span style="color:#811f24;font-weight:700">String</span><span style="color:#794938">,</span> salaryNet <span style="color:#794938">::</span> <span style="color:#811f24;font-weight:700">Money</span>} <span style="color:#794938">deriving</span> (<span style="color:#bf4f24">Show</span>)
<span style="color:#811f24;font-weight:700">Prelude</span><span style="color:#794938">></span> <span style="color:#794938">import</span> <span style="color:#794938">qualified</span> <span style="color:#691c97">Data.Map</span> <span style="color:#794938">as</span> <span style="color:#691c97">Map</span>
<span style="color:#811f24;font-weight:700">Prelude</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">></span> <span style="color:#794938">type</span> <span style="color:#811f24;font-weight:700">DataBase</span> <span style="color:#794938">=</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">.</span><span style="color:#811f24;font-weight:700">Map</span> <span style="color:#811f24;font-weight:700">Int</span> <span style="color:#811f24;font-weight:700">User</span>
<span style="color:#811f24;font-weight:700">Prelude</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">></span> <span style="color:#794938">let</span> database <span style="color:#794938">=</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">.</span>fromList [(<span style="color:#811f24;font-weight:700">1</span><span style="color:#794938">,</span><span style="color:#811f24;font-weight:700">User</span> <span style="color:#0b6125">"Stefan"</span> <span style="color:#811f24;font-weight:700">10.0</span>)<span style="color:#794938">,</span>(<span style="color:#811f24;font-weight:700">2</span><span style="color:#794938">,</span><span style="color:#811f24;font-weight:700">User</span> <span style="color:#0b6125">"Zdzislawa"</span> <span style="color:#811f24;font-weight:700">15.0</span>)]
</pre>
<p>The first interesting thing is that I was unable to create the "Dark Ages" example in Haskell because... there is no null (there is a function called "null", but it checks whether a list is empty).
And Option in Haskell is called "Maybe" and it has two constructors: "Nothing" and "Just"</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">let</span> ourLookup database key <span style="color:#794938">=</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">.</span><span style="color:#693a17">lookup</span> key database
<span style="color:#811f24;font-weight:700">Prelude</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">></span> <span style="color:#794938">:</span>t ourLookup
<span style="color:#bf4f24">ourLookup</span> <span style="color:#794938">::</span> <span style="color:#a71d5d;font-style:italic">Ord</span> <span style="color:#234a97">k</span> <span style="color:#794938">=></span> <span style="color:#a71d5d;font-style:italic">Map</span> <span style="color:#234a97">k</span> <span style="color:#234a97">a</span> <span style="color:#794938">-></span> <span style="color:#234a97">k</span> <span style="color:#794938">-></span> <span style="color:#691c97">Maybe</span> <span style="color:#234a97">a</span>
</pre>
<p>The salaryNet accessor function was generated automatically when we declared the User structure </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">:</span>t salaryNet
<span style="color:#bf4f24">salaryNet</span> <span style="color:#794938">::</span> <span style="color:#a71d5d;font-style:italic">User</span> <span style="color:#794938">-></span> <span style="color:#a71d5d;font-style:italic">Money</span>
</pre>
<p>And finally our composition </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">let</span> grossValue money <span style="color:#794938">=</span> money * <span style="color:#811f24;font-weight:700">1.23</span>
<span style="color:#794938">let</span> perfectComposition <span style="color:#794938">=</span> grossValue <span style="color:#794938">.</span> salaryNet
<span style="color:#811f24;font-weight:700">Prelude</span> <span style="color:#811f24;font-weight:700">Data</span><span style="color:#794938">.</span><span style="color:#811f24;font-weight:700">Map</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">></span> <span style="color:#794938">:</span>t perfectComposition
<span style="color:#bf4f24">perfectComposition</span> <span style="color:#794938">::</span> <span style="color:#a71d5d;font-style:italic">User</span> <span style="color:#794938">-></span> <span style="color:#a71d5d;font-style:italic">Money</span>
</pre>
<p>
And "scaryM" approach usage
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">let</span> lookupScaryM key <span style="color:#794938">=</span> <span style="color:#693a17">fmap</span> perfectComposition (ourLookup database key)
lookupScaryM <span style="color:#811f24;font-weight:700">1</span>
<span style="color:#b4371f">Just</span> <span style="color:#811f24;font-weight:700">12.3</span>
lookupScaryM <span style="color:#811f24;font-weight:700">4</span>
<span style="color:#b4371f">Nothing</span>
</pre>
<p>
What is fmap? This is a wonderful moment, because this is the place where practice meets formal theory (hint: the word "FUNCTOR"...)
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#811f24;font-weight:700">Prelude</span> <span style="color:#811f24;font-weight:700">Map</span><span style="color:#794938">></span> <span style="color:#794938">:</span>t <span style="color:#693a17">fmap</span>
<span style="color:#bf4f24">fmap</span> <span style="color:#794938">::</span> <span style="color:#a71d5d;font-style:italic">Functor</span> <span style="color:#234a97">f</span> <span style="color:#794938">=></span> (<span style="color:#234a97">a</span> <span style="color:#794938">-></span> <span style="color:#234a97">b</span>) <span style="color:#794938">-></span> <span style="color:#234a97">f</span> <span style="color:#234a97">a</span> <span style="color:#794938">-></span> <span style="color:#234a97">f</span> <span style="color:#234a97">b</span>
</pre>
<p>Now let's return to Scala and see some more powerful functional composition mechanisms</p>
<h1>Even more composition - summon the Dragon</h1>
<p>
Let's recall the last example of our Scala composition
</p>
<pre style="background:#f9f9f9;color:#080808">val stillPerfectComposition<span style="color:#794938">:</span> (<span style="color:#811f24;font-weight:700">User</span>) <span style="color:#794938">=></span> <span style="color:#811f24;font-weight:700">Money</span> <span style="color:#794938">=</span>net compose salary
val composedScaryM <span style="color:#794938">:</span><span style="color:#811f24;font-weight:700">Int</span> <span style="color:#794938">=></span> <span style="color:#811f24;font-weight:700">Option</span>[<span style="color:#811f24;font-weight:700">Money</span>]<span style="color:#794938">=</span>
key <span style="color:#794938">=></span> lookupScaryM(database)(key) <span style="color:#693a17">map</span> stillPerfectComposition
</pre>
<p> We can summon the dragon </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">import</span> scalaz._
<span style="color:#794938">import</span> <span style="color:#691c97">Scalaz._</span>
</pre>
<p>And use something called Kleisli Composition </p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCkkJN7kk2FmlmyqdZ46zkjwkux6D0q5qgy32Wa1A2p9JRfWtGOMKEkyXFQCPbRcoAAFhdmD2aMXMugAudZjlSS6OyPSDImPVUHqP85-JSBJh4WX2UKHbx2fFcCvnLRgQxfUv16Hl7mLBl/s1600/kalesi.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiCkkJN7kk2FmlmyqdZ46zkjwkux6D0q5qgy32Wa1A2p9JRfWtGOMKEkyXFQCPbRcoAAFhdmD2aMXMugAudZjlSS6OyPSDImPVUHqP85-JSBJh4WX2UKHbx2fFcCvnLRgQxfUv16Hl7mLBl/s400/kalesi.jpeg" /></a></div>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">findUser</span>=(database:<span style="color:#bf4f24">DataBase</span>)=><span style="color:#bf4f24">Kleisli</span>.kleisli[<span style="color:#bf4f24">Option</span>, <span style="color:#a71d5d;font-style:italic">Int</span>, <span style="color:#bf4f24">User</span>] {
(key:<span style="color:#a71d5d;font-style:italic">Int</span>) => lookupScaryM(database)(key)
}
<span style="color:#5a525f;font-style:italic">//HERE IS THIS BETTER COMPOSITION</span>
<span style="color:#794938">val</span> <span style="color:#bf4f24">program</span>=findUser(database) map stillPerfectComposition
program.run(<span style="color:#811f24;font-weight:700">1</span>)
<span style="color:#5a525f;font-style:italic">//res6: Option[Money] = Some(12.3)</span>
program.run(<span style="color:#811f24;font-weight:700">4</span>)
<span style="color:#5a525f;font-style:italic">//res7: Option[Money] = None</span>
</pre>
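<p>To see what Kleisli composition does without the library, here is a minimal hand-written sketch for Option only. Everything in it (OptionKleisli, andThenKleisli, positiveSalary, the tiny database) is a hypothetical stand-in invented for illustration, not the scalaz API and not the code from this post. The point is just that functions of shape A =&gt; Option[B] compose as easily as plain functions once flatMap is hidden inside the wrapper.</p>

```scala
object KleisliSketch {
  // Hypothetical stand-ins for the types used in the article.
  type Money = Double
  final case class User(name: String, salaryNet: Money)

  val database: Map[Int, User] = Map(1 -> User("Stefan", 10.0))

  // Hand-rolled Kleisli arrow for Option: it only wraps a function A => Option[B].
  final case class OptionKleisli[A, B](run: A => Option[B]) {
    // Transform a successful result with a pure function.
    def map[C](f: B => C): OptionKleisli[A, C] =
      OptionKleisli(a => run(a).map(f))

    // Kleisli composition: pass a successful result to the next effectful step.
    def andThenKleisli[C](next: OptionKleisli[B, C]): OptionKleisli[A, C] =
      OptionKleisli(a => run(a).flatMap(next.run))
  }

  val findUser: OptionKleisli[Int, User] =
    OptionKleisli(key => database.get(key))

  // A second step that can also fail, so plain "compose" would not be enough.
  val positiveSalary: OptionKleisli[User, Money] =
    OptionKleisli(user => if (user.salaryNet > 0) Some(user.salaryNet) else None)

  val grossValue: Money => Money = _ * 1.23

  // Two failing steps and one pure step chained into a single Int => Option[Money].
  val program: OptionKleisli[Int, Money] =
    findUser.andThenKleisli(positiveSalary).map(grossValue)
}
```

<p>KleisliSketch.program.run(1) yields a Some with the gross salary, and KleisliSketch.program.run(4) yields None because the key is missing. A missing key and a failed second step are both handled by the same mechanism.</p>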
<p>We can even eliminate the "database" parameter from the composition, which gives us a functional dependency injection mechanism </p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">dependencyInjection</span> = findUser andThen(_ map stillPerfectComposition)
<span style="color:#794938">val</span> <span style="color:#bf4f24">result</span>=dependencyInjection(database)(<span style="color:#811f24;font-weight:700">1</span>)
<span style="color:#5a525f;font-style:italic">// res8: Option[Money] = Some(12.3)</span>
dependencyInjection(database)(<span style="color:#811f24;font-weight:700">4</span>)
<span style="color:#5a525f;font-style:italic">//res9: Option[Money] = None</span>
</pre>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhu6w6amiRGb2ksIlEok2gcuZFKUSOLihDeIJcTb3h2Z7KhsmejsCbrlD6ndWNYDan2ap58gdEOwASV1lIV6m-pof2B_nHp4KDkbS9LrQp-a3wxvvKffL-wSJ3KAfFkwv6uyI9bzchblPzS/s1600/underscore.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhu6w6amiRGb2ksIlEok2gcuZFKUSOLihDeIJcTb3h2Z7KhsmejsCbrlD6ndWNYDan2ap58gdEOwASV1lIV6m-pof2B_nHp4KDkbS9LrQp-a3wxvvKffL-wSJ3KAfFkwv6uyI9bzchblPzS/s400/underscore.jpg" /></a></div>
<p>
In 2004 there was a movie - <a href="http://www.imdb.com/title/tt0368447/">The village</a>. If I remember correctly, it was about a group of people who were deceived into believing that it was still the XIX century while it was really the XXI. I had the same feeling three years ago when I left the "Java comfort zone". There are some very powerful languages out there with syntax a lot different from Java (whose syntax was borrowed from C++, whose syntax was borrowed from C). Haskell has underscores, OCaml has underscores, so maybe it's really worth learning a new way of thinking even if the syntax is not from Java.
</p>
<h1>Log your Dataframes transformations with Writer Monad</h1>
<p>Paweł Włodarski, published 2015-10-25, updated 2017-06-28</p>
<p class="akapit">
In the <a href="http://usethiscode.blogspot.com/2015/10/spark-dataframes-transformations-with.html">last article</a> we tried to use the <i>State Monad</i> to build a computation chain where <i>step N</i> depends on <i>step (N-1)</i>. We wanted to have a set of small and easily composable pieces of code. To escape from the high level of abstraction and check our design in a more practical environment, we are using <b>Spark Dataframes</b> as an illustration.
</p>
<p class="akapit">
The State Monad represents a function which modifies some state and simultaneously generates a result value from it : <b>S1 => (R,S2)</b>.
In our case we treated S as a Map with the Dataframes created during the computation. R from the equation above was just the <i>Logs</i> generated during the calculation of a new state.
</p>
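<p>As a reminder of that shape, here is a minimal sketch in plain Scala, without any monad library. The names Step, add and runAll are hypothetical and exist only for this illustration; they are not taken from the previous article. It shows a step of type S1 =&gt; (R,S2) where the state is a dictionary and the result is a log line, and it also shows the annoying part: someone has to collect the logs by hand.</p>

```scala
object StateSketch {
  type Log = String
  type Dictionary = Map[String, Any]

  // One step in the shape S1 => (R, S2): the state is the Dictionary,
  // the result R is the log line produced while computing the new state.
  type Step = Dictionary => (Log, Dictionary)

  def add(key: String, value: Any): Step =
    dict => (s"adding $key -> $value", dict + (key -> value))

  // Chaining by hand: the caller has to thread the state through every step
  // and glue the individual logs together.
  def runAll(steps: List[Step])(initial: Dictionary): (List[Log], Dictionary) =
    steps.foldLeft((List.empty[Log], initial)) { case ((logs, dict), step) =>
      val (log, next) = step(dict)
      (logs :+ log, next)
    }
}
```

<p>Calling StateSketch.runAll(List(StateSketch.add("one", 1), StateSketch.add("two", true)))(Map("initial" -> "initialState")) returns both log lines together with the extended dictionary.</p>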
<p class="akapit">
The problem with that solution was that we had to pass Strings between transformations to build the final <b>Log</b>. Today we are going to check a different construct - the <b>Writer Monad</b>. According to its description it should handle this <i>"logs problem"</i> for us - let's see if I'm smart enough to use this tool properly ;)
</p>
<h1>Understand Writer Monad</h1>
<p>
As a preparation for our experiments let's define a couple of useful types. The first one is <i>Dictionary</i>, which is a kind of bag for computation results like DataFrames, and the second one, <i>Log</i>, is just a debug message.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span>,<span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">]</span>
<span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">String</span>
<span style="color: #008800; font-weight: bold">val</span> initialDict<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">(</span><span style="background-color: #fff0f0">"initial"</span><span style="color: #333333">-></span><span style="background-color: #fff0f0">"initialState"</span><span style="color: #333333">)</span>
</pre></div>
<p>
Now let's create <i>"the subject of our research"</i>, the very first instance of the Writer Monad in this blog post :
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> initWriter<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Writer</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">List</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">]</span>, <span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span><span style="color: #BB0066; font-weight: bold">Writer</span><span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333">.</span>empty<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">],</span>initialDict<span style="color: #333333">)</span>
</pre></div>
<p>
At the beginning our business logic will be represented by a method which just adds a simple value (new state) to the previously defined dictionary. Notice that what we are building here is a phase of computation with the signature : <i>Dictionary=>Dictionary</i>
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">def</span> <span style="color:#bf4f24">addToDictionary</span>(key:<span style="color:#a71d5d;font-style:italic">String</span>,value:<span style="color:#bf4f24">Any</span>) : <span style="color:#bf4f24">Dictionary</span> => <span style="color:#bf4f24">Dictionary</span> = { dict =>
dict + (key -> value)
}
</pre>
<p>
We are going to start our "lab research" by checking the simple <i>map</i> method on Writer - will it be enough?
</p>
<pre style="background:#f9f9f9;color:#080808"><span style="color:#794938">val</span> <span style="color:#bf4f24">result</span>=initWriter
.map(addToDictionary(<span style="color:#0b6125">"one"</span>,<span style="color:#811f24;font-weight:700">1</span>))
.map(addToDictionary(<span style="color:#0b6125">"two"</span>,<span style="color:#811f24;font-weight:700">true</span>))
.map(addToDictionary(<span style="color:#0b6125">"three"</span>,<span style="color:#0b6125">"value"</span>))
result.run
<span style="color:#5a525f;font-style:italic">//result is :</span>
(<span style="color:#bf4f24">List</span>(),<span style="color:#bf4f24">Map</span>(initial -> initialState, one -> <span style="color:#811f24;font-weight:700">1</span>, two -> <span style="color:#811f24;font-weight:700">true</span>, three -> value))
</pre>
<p>
So we are not satisfied, because a simple map does not generate any logs. Well, this is actually not a surprise, because we do not generate any logs in our "business logic" either...
Let's fix it.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> addToWriter<span style="color: #333333">(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>value<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Dictionary</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">Writer</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">List</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">]</span>,<span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span> dict <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #BB0066; font-weight: bold">Writer</span><span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333">(</span>s<span style="background-color: #fff0f0">"adding $key -> $value"</span><span style="color: #333333">),</span>dict <span style="color: #333333">+</span> <span style="color: #333333">(</span>key <span style="color: #333333">-></span> value<span style="color: #333333">))</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> transformationWriter<span style="color: #008800; font-weight: bold">=</span>initWriter
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>addToWriter<span style="color: #333333">(</span><span style="background-color: #fff0f0">"one"</span><span style="color: #333333">,</span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">))</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>addToWriter<span style="color: #333333">(</span><span style="background-color: #fff0f0">"two"</span><span style="color: #333333">,</span><span style="color: #008800; font-weight: bold">true</span><span style="color: #333333">))</span>
<span style="color: #333333">.</span>mapWritten<span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">_</span> <span style="color: #333333">:+</span> <span style="background-color: #fff0f0">"additional log"</span><span style="color: #333333">)</span> <span style="color: #888888">//<--- haaa this line is interesting!!</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>addToWriter<span style="color: #333333">(</span><span style="background-color: #fff0f0">"three"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"value"</span><span style="color: #333333">))</span>
</pre></div>
<p>
In this case we are building a new Writer instance in each transformation phase. Creating a list with one element may seem a little bit awkward at the beginning, but at the end all the lists will be combined into one final log. Also <i>mapWritten</i> is interesting because it gives us the power to add some additional logging without touching "the computation subject"
</p>
<p>
Ok time to <b>run it</b>!
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> <span style="color: #333333">(</span>fullLog<span style="color: #333333">,</span>fullResult<span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span>transformationWriter<span style="color: #333333">.</span>run
fullLog
<span style="color: #888888">//res1: List[String] = List(adding one -> 1, adding two -> true, additional log, adding three -> value)</span>
fullResult
<span style="color: #888888">//res2: Dictionary = Map(initial -> initialState, one -> 1, two -> true, three -> value)</span>
</pre></div>
<p>Looks good! (Really, it looks good - this is exactly what we were expecting.) Below you can find an illustration to better understand the Writer mechanism, and now I believe we are ready for the Spark scenario</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfOOLlbidqI5zn3j5gP0EOb5BKylbC3xA8ba-PkLlpLahRu5zr4ub7qWeX_v9eYRR1dL8AEQDnDJN7FSppfCCqt8_dUVI_BHLs-M231EN41Ja0hyphenhyphenskjndv25L11HXo2ENDdvCuv7KpE1PC/s1600/writertransformation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfOOLlbidqI5zn3j5gP0EOb5BKylbC3xA8ba-PkLlpLahRu5zr4ub7qWeX_v9eYRR1dL8AEQDnDJN7FSppfCCqt8_dUVI_BHLs-M231EN41Ja0hyphenhyphenskjndv25L11HXo2ENDdvCuv7KpE1PC/s640/writertransformation.png"></a></div>
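<p>If the mechanism still feels like magic, the sketch below rebuilds it by hand in plain Scala. This is a simplified, hypothetical Writer written only for this illustration: the log type is fixed to List[String], and it is not the scalaz implementation. What it shows is that flatMap does nothing more than run the next step and append that step's log to the log collected so far.</p>

```scala
object WriterSketch {
  type Dictionary = Map[String, Any]

  // Minimal hand-written Writer: a value paired with the log accumulated so far.
  final case class Writer[A](written: List[String], value: A) {
    // map touches only the value, so it can never add a log entry.
    def map[B](f: A => B): Writer[B] = Writer(written, f(value))

    // flatMap runs the next step and appends its log to ours.
    def flatMap[B](f: A => Writer[B]): Writer[B] = {
      val next = f(value)
      Writer(written ++ next.written, next.value)
    }

    // mapWritten touches only the log and leaves the value alone.
    def mapWritten(f: List[String] => List[String]): Writer[A] =
      Writer(f(written), value)

    def run: (List[String], A) = (written, value)
  }

  def addToWriter(key: String, value: Any): Dictionary => Writer[Dictionary] =
    dict => Writer(List(s"adding $key -> $value"), dict + (key -> value))

  val (fullLog, fullResult) =
    Writer(List.empty[String], Map[String, Any]("initial" -> "initialState"))
      .flatMap(addToWriter("one", 1))
      .mapWritten(_ :+ "additional log")
      .flatMap(addToWriter("two", true))
      .run
  // fullLog: List(adding one -> 1, additional log, adding two -> true)
}
```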
<h1>Spark use case </h1>
<p class="akapit">
In this section we are going to implement a very simple but fully functional Spark example. The general concept is explained in a different post --> <a href="http://usethiscode.blogspot.com/2015/09/functional-data-transformation-with.html">General concept of functional data transformation</a> and it's about building a transformation pipeline from small and simple functional bricks which are easy to test and compose. You can find the full code on github --> <a href="https://github.com/PawelWlodarski/blog/blob/master/src/main/scala/pl/pawelwlodarski/spark/writer/WriterSparkExample.scala">Full code : Dataframes transformations with Writer Monad</a>
</p>
<pre style="background:#f9f9f9;color:#080808"> <span style="color:#794938">type</span> <span style="color:#bf4f24">DataDictionary</span>=<span style="color:#bf4f24">Map</span>[<span style="color:#a71d5d;font-style:italic">String</span>,<span style="color:#bf4f24">Any</span>]
<span style="color:#794938">type</span> <span style="color:#bf4f24">Log</span>=<span style="color:#a71d5d;font-style:italic">String</span>
<span style="color:#794938">val</span> <span style="color:#bf4f24">addTimeStamp</span>: <span style="color:#bf4f24">DataFrame</span> => <span style="color:#bf4f24">DataFrame</span> =
{ df =>
df.withColumn(<span style="color:#0b6125">"created"</span>,current_date())
}
<span style="color:#794938">val</span> <span style="color:#bf4f24">addLabel</span>:<span style="color:#a71d5d;font-style:italic">String</span> => <span style="color:#bf4f24">DataFrame</span> => <span style="color:#bf4f24">DataFrame</span> =
{label => df =>
df.withColumn(<span style="color:#0b6125">"label"</span>,lit(label))
}
<span style="color:#794938">val</span> <span style="color:#bf4f24">businessJoin</span> : (<span style="color:#a71d5d;font-style:italic">String</span>,<span style="color:#a71d5d;font-style:italic">String</span>) => (<span style="color:#bf4f24">DataFrame</span>,<span style="color:#bf4f24">DataFrame</span>) => <span style="color:#bf4f24">DataFrame</span> =
{(column1,column2) => (df1,df2) =>
df1.join(df2, df1(column1) === df2(column2))
}
</pre>
<p>
The functional bricks mentioned above are just simple functions which operate on Dataframes. Now we need to lift them to the level of our computation chain, which uses <i>DataDictionary</i>. This is the moment when we use the <i>Writer Monad</i> for the first time in our Spark example - take a look :
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">DataDictionary</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">Writer</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">List</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]</span>,<span style="color: #333399; font-weight: bold">DataDictionary</span><span style="color: #333333">]</span>
<span style="color: #008800; font-weight: bold">def</span> liftToTransformation<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A:Extractor</span><span style="color: #333333">](</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">DataFrame</span><span style="color: #333333">)(</span>key1<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)(</span>resultKey<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span>
<span style="color: #333333">{</span>dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> param1 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key1<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>f<span style="color: #333333">(</span>param1<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> log<span style="color: #008800; font-weight: bold">=</span> s<span style="background-color: #fff0f0">"\nadding $resultKey -> $result"</span>
<span style="color: #008800; font-weight: bold">val</span> newDictionary<span style="color: #008800; font-weight: bold">=</span>dictionary <span style="color: #333333">+</span> <span style="color: #333333">(</span>resultKey <span style="color: #333333">-></span> result<span style="color: #333333">)</span>
<span style="color: #BB0066; font-weight: bold">Writer</span><span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333">(</span>log<span style="color: #333333">),</span>newDictionary<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
</pre></div>
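<p>The Extractor used in the context bound above is not shown in this post, so here is a rough guess at how such a type class could look. Treat all of it as an assumption: the Extractor trait below, the Frame stand-in (used instead of a real Spark DataFrame so the sketch runs without Spark) and the lift helper are hypothetical and may differ from the code on github. The idea it illustrates is that [A:Extractor] asks the compiler to find the right way of pulling a value of type A out of the dictionary.</p>

```scala
object ExtractorSketch {
  type DataDictionary = Map[String, Any]

  // Stand-in for Spark's DataFrame so that the sketch runs without Spark.
  final case class Frame(name: String)

  // Hypothetical type class: how to pull a value of type A out of the dictionary.
  trait Extractor[A] {
    def extract(dictionary: DataDictionary)(key: String): A
  }

  implicit val frameExtractor: Extractor[Frame] = new Extractor[Frame] {
    // Throws if the key is missing or holds a value of a different type.
    def extract(dictionary: DataDictionary)(key: String): Frame =
      dictionary(key).asInstanceOf[Frame]
  }

  // Same shape as liftToTransformation but without the Writer part:
  // the context bound [A: Extractor] makes the compiler supply the instance.
  def lift[A: Extractor](f: A => Frame)(key: String)(resultKey: String): DataDictionary => DataDictionary =
    dictionary => {
      val param = implicitly[Extractor[A]].extract(dictionary)(key)
      dictionary + (resultKey -> f(param))
    }

  val addLabel: Frame => Frame = frame => Frame(frame.name + "-labelled")
  val phase: DataDictionary => DataDictionary = lift(addLabel)("InitialFrame")("Labelled")
}
```

<p>With this in place ExtractorSketch.phase(Map("InitialFrame" -> ExtractorSketch.Frame("customers"))) returns a dictionary that additionally contains Frame("customers-labelled") under the key "Labelled".</p>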
<p>
With this lifting function we can easily create functions which operate on DataDictionary
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> addTimestampPhase<span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>addTimeStamp<span style="color: #333333">)(</span><span style="background-color: #fff0f0">"InitialFrame"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"WithTimeStamp"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> addLabelPhase<span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>addLabel<span style="color: #333333">(</span><span style="background-color: #fff0f0">"experiment"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"WithTimeStamp"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"Labelled"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> businessJoinPhase<span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>businessJoin<span style="color: #333333">(</span><span style="background-color: #fff0f0">"customerId"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"id"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"Labelled"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"SomeOtherFrame"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"JoinedByBusinessRules"</span><span style="color: #333333">)</span>
</pre></div>
<p>
And because lifted functions have the signature <b>Dictionary => Writer</b>, we can easily compose a computation chain with <i>flatMap</i>, and <i>Writer</i> will take care of log composition
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> transformation1<span style="color: #333333">=(</span>dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataDictionary</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=></span>start<span style="color: #333333">(</span>dictionary<span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>addTimestampPhase<span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>addLabelPhase<span style="color: #333333">)</span>
<span style="color: #333333">.</span>mapWritten<span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">_</span> <span style="color: #333333">:+</span> <span style="background-color: #fff0f0">"before business join"</span><span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>businessJoinPhase<span style="color: #333333">)</span>
</pre></div>
<p>This leaves us with the last missing piece of the puzzle - transformation composition. Below is an example of how we can chain the first transformation with another one. </p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #888888">//transformation2</span>
<span style="color: #008800; font-weight: bold">val</span> importantSelect<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #008800; font-weight: bold">_</span><span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"customerId"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"credit"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"label"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"created"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> importantSelectPhase <span style="color: #008800; font-weight: bold">=</span>liftToTransformation<span style="color: #333333">(</span>importantSelect<span style="color: #333333">)(</span><span style="background-color: #fff0f0">"JoinedByBusinessRules"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"BusinessReport"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> transformation2<span style="color: #333333">=(</span>dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataDictionary</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=></span>start<span style="color: #333333">(</span>dictionary<span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>importantSelectPhase<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> transformationComposed<span style="color: #333333">=(</span>dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataDictionary</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=></span>start<span style="color: #333333">(</span>dictionary<span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>transformation1<span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>transformation2<span style="color: #333333">)</span>
</pre></div>
<h2> Full Working Example </h2>
<p>As I mentioned before, you can find all the code on github. However, for the 99% who will never go there - here is the main function :)
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> main<span style="color: #333333">(</span>args<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Array</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">])</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">val</span> config<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">SparkConf</span><span style="color: #333333">().</span>setMaster<span style="color: #333333">(</span><span style="background-color: #fff0f0">"local[4]"</span><span style="color: #333333">).</span>setAppName<span style="color: #333333">(</span><span style="background-color: #fff0f0">"Dataframes transformation with State Monad"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> sc<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">SparkContext</span><span style="color: #333333">(</span>config<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> sqlContext<span style="color: #008800; font-weight: bold">=new</span> <span style="color: #BB0066; font-weight: bold">SQLContext</span><span style="color: #333333">(</span>sc<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">sqlContext.implicits._</span>
println<span style="color: #333333">(</span><span style="background-color: #fff0f0">"example start"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> df1<span style="color: #008800; font-weight: bold">=</span>sc<span style="color: #333333">.</span>parallelize<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Seq</span><span style="color: #333333">(</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"cust1@gmail.com"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"Stefan"</span><span style="color: #333333">),</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">2</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"cust2@gmail.com"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"Zdzislawa"</span><span style="color: #333333">),</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">3</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"cust3@gmail.com"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"Bonifacy"</span><span style="color: #333333">),</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">4</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"cust4@gmail.com"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"Bozebozebozenka"</span><span style="color: #333333">)</span>
<span style="color: #333333">)).</span>toDF<span style="color: #333333">(</span><span style="background-color: #fff0f0">"customerId"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"email"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"name"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> df2<span style="color: #008800; font-weight: bold">=</span>sc<span style="color: #333333">.</span>parallelize<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Seq</span><span style="color: #333333">(</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">,</span><span style="color: #0000DD; font-weight: bold">10</span><span style="color: #333333">),</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">2</span><span style="color: #333333">,</span><span style="color: #0000DD; font-weight: bold">20</span><span style="color: #333333">),</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">3</span><span style="color: #333333">,</span><span style="color: #0000DD; font-weight: bold">30</span><span style="color: #333333">),</span>
<span style="color: #333333">(</span><span style="color: #0000DD; font-weight: bold">4</span><span style="color: #333333">,</span><span style="color: #0000DD; font-weight: bold">40</span><span style="color: #333333">)</span>
<span style="color: #333333">)).</span>toDF<span style="color: #333333">(</span><span style="background-color: #fff0f0">"id"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"credit"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataDictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">(</span><span style="background-color: #fff0f0">"InitialFrame"</span> <span style="color: #333333">-></span> df1<span style="color: #333333">,</span><span style="background-color: #fff0f0">"SomeOtherFrame"</span><span style="color: #333333">-></span>df2<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #333333">(</span>log<span style="color: #333333">,</span>resultDictionary<span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span>transformationComposed<span style="color: #333333">(</span>dictionary<span style="color: #333333">).</span>run
println<span style="color: #333333">(</span><span style="background-color: #fff0f0">"**************LOG*************** : "</span><span style="color: #333333">+</span>log<span style="color: #333333">)</span>
println<span style="color: #333333">(</span><span style="background-color: #fff0f0">"**************DICTIONARY********"</span><span style="color: #333333">)</span>
resultDictionary<span style="color: #333333">.</span>foreach<span style="color: #333333">(</span>println<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>resultDictionary<span style="color: #333333">(</span><span style="background-color: #fff0f0">"BusinessReport"</span><span style="color: #333333">).</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">]</span>
result<span style="color: #333333">.</span>show<span style="color: #333333">()</span>
<span style="color: #333333">}</span>
</pre></div>
And this gives us the <b>result in the console</b>:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">//In list log we have every step logged - along with the message added in mapWritten
**************LOG*************** : List<span style="color: #333333">(</span>
adding WithTimeStamp -> <span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date<span style="color: #333333">]</span>,
adding Labelled -> <span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string<span style="color: #333333">]</span>,
before business join,
adding JoinedByBusinessRules -> <span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string, id: int, credit: int<span style="color: #333333">]</span>,
adding BusinessReport -> <span style="color: #333333">[</span>customerId: int, credit: int, label: string, created: date<span style="color: #333333">])</span>
//If you want to analyze this example in detail, you will find a reference to each DataFrame used in DataDictionary
**************DICTIONARY********
<span style="color: #333333">(</span>JoinedByBusinessRules,<span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string, id: int, credit: int<span style="color: #333333">])</span>
<span style="color: #333333">(</span>WithTimeStamp,<span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date<span style="color: #333333">])</span>
<span style="color: #333333">(</span>BusinessReport,<span style="color: #333333">[</span>customerId: int, credit: int, label: string, created: date<span style="color: #333333">])</span>
<span style="color: #333333">(</span>InitialFrame,<span style="color: #333333">[</span>customerId: int, email: string, name: string<span style="color: #333333">])</span>
<span style="color: #333333">(</span>SomeOtherFrame,<span style="color: #333333">[</span>id: int, credit: int<span style="color: #333333">])</span>
<span style="color: #333333">(</span>Labelled,<span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string<span style="color: #333333">])</span>
//And our final result - nicely formatted by Spark - looks as expected
+----------+------+----------+----------+
|customerId|credit| label| created|
+----------+------+----------+----------+
| 1| 10|experiment|2015-10-24|
| 2| 20|experiment|2015-10-24|
| 3| 30|experiment|2015-10-24|
| 4| 40|experiment|2015-10-24|
+----------+------+----------+----------+
</pre></div>
<h2>One more possibility</h2>
<p>
If you look deeper into the Writer implementation in ScalaZ, you will find a very interesting declaration which looks like this:
</p>
<pre style="background:#f9f9f9;color:#080808"> <span style="color:#234a97">type</span> <span style="color:#234a97">Writer</span>[<span style="color:#234a97">W</span>, <span style="color:#234a97">A</span>] <span style="color:#794938">=</span> <span style="color:#234a97">WriterT</span>[<span style="color:#234a97">Id</span>, <span style="color:#234a97">W</span>, <span style="color:#234a97">A</span>]
</pre>
<p>
What is <i>WriterT</i>? It's a <i>monad transformer</i>, which is a topic for a different article, but generally it wraps a log-and-value tuple that lives inside <i>"something else"</i> (here that something is <i>Id</i>, so it is just a plain tuple). This gives us the possibility to limit Writer occurrences in our code to a minimum and write our functions against a standard tuple (maybe for some reason you have such a need, or maybe you are just curious how to do it).
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> addToTuple<span style="color: #333333">(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>value<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Dictionary</span> <span style="color: #333333">=></span> <span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">],</span><span style="color: #BB0066; font-weight: bold">Dictionary</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span> dict <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">List</span><span style="color: #333333">(</span>s<span style="background-color: #fff0f0">"adding $key -> $value"</span><span style="color: #333333">),</span>dict <span style="color: #333333">+</span> <span style="color: #333333">(</span>key <span style="color: #333333">-></span> value<span style="color: #333333">))</span> <span style="color: #888888">// Standard Tuple</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> tupledResult<span style="color: #008800; font-weight: bold">=</span>initWriter
<span style="color: #333333">.</span>flatMapF<span style="color: #333333">(</span>addToTuple<span style="color: #333333">(</span><span style="background-color: #fff0f0">"one"</span><span style="color: #333333">,</span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">))</span>
<span style="color: #333333">.</span>flatMapF<span style="color: #333333">(</span>addToTuple<span style="color: #333333">(</span><span style="background-color: #fff0f0">"two"</span><span style="color: #333333">,</span><span style="color: #008800; font-weight: bold">true</span><span style="color: #333333">))</span>
<span style="color: #333333">.</span>mapWritten<span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">_</span> <span style="color: #333333">:+</span> <span style="background-color: #fff0f0">"additional log"</span><span style="color: #333333">)</span>
<span style="color: #333333">.</span>flatMapF<span style="color: #333333">(</span>addToTuple<span style="color: #333333">(</span><span style="background-color: #fff0f0">"three"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"value"</span><span style="color: #333333">))</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #333333">(</span>log<span style="color: #333333">,</span>r<span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span>tupledResult<span style="color: #333333">.</span>run
</pre></div>
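<p>
To see why <i>flatMapF</i> lets the functions stay tuple-based, here is a minimal sketch in plain Scala. It is not the ScalaZ implementation: <i>SimpleWriter</i> is a hypothetical, simplified stand-in for <i>Writer[List[String], A]</i>, and the <i>initWriter</i> defined here is an assumption made only so the sketch is self-contained.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object WriterSketch {
  type Dictionary = Map[String, Any]

  // Hypothetical stand-in for scalaz Writer[List[String], A]: just a (log, value) tuple
  final case class SimpleWriter[A](run: (List[String], A)) {
    // Takes a function that returns a plain tuple and appends its log to ours
    def flatMapF[B](f: A => (List[String], B)): SimpleWriter[B] = {
      val (log, value) = run
      val (newLog, newValue) = f(value)
      SimpleWriter((log ++ newLog, newValue))
    }

    // Transforms only the log part, the value stays untouched
    def mapWritten(f: List[String] => List[String]): SimpleWriter[A] =
      SimpleWriter((f(run._1), run._2))
  }

  // Same shape as addToTuple above: no Writer type in the signature
  def addToTuple(key: String, value: Any): Dictionary => (List[String], Dictionary) = { dict =>
    (List(s"adding $key -> $value"), dict + (key -> value))
  }

  // Assumption: an empty log and an empty dictionary as the starting point
  val initWriter: SimpleWriter[Dictionary] =
    SimpleWriter((List.empty[String], Map.empty[String, Any]))

  val (log, result) = initWriter
    .flatMapF(addToTuple("one", 1))
    .mapWritten(_ :+ "additional log")
    .flatMapF(addToTuple("two", true))
    .run
}
</pre></div>
<p>
In this sketch only <i>SimpleWriter</i> knows how logs are concatenated. The business functions return ordinary tuples, so they can be tested without any Writer in sight.
</p>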
<h1>What next?</h1>
<p class="akapit">
We can dive deeper into WriterT and other Monad transformers to learn what they are, or maybe we can focus on reading a DataFrame from the filesystem into DataDictionary? There is also something else worth checking and quite intriguing - something called <i>Free Monads</i>...
</p>
<h1>Spark Dataframes transformations with state Monad</h1>
<p><i>Paweł Włodarski, published 2015-10-04, updated 2017-06-17</i></p>
<p class="akapit">
I want to make this post an interesting journey of discovery: how some very interesting functional concepts may solve very practical problems.
</p>
<p>
The problem is maybe not common but still practical (I need to find a synonym for "practical" - "real world" is ok?).
<ul>
<li><b>How to make chain of Dataframes transformations in Spark easier to test, easier to maintain and easier to evolve?</b> </li>
</ul>
I described it in <a href="http://usethiscode.blogspot.com/2015/09/functional-data-transformation-with.html">my last post</a>, where I also showed some attempts to solve it.
</p>
<p class="akapit">
So the general idea is to chain pure functions into one transformation pipeline. A design based on small functions should ease testing and provide more chances to compose those small pieces of logic.
On the other hand, there was no simple option to compose two transformations, and the feeling of "reinventing the wheel" was difficult to shake off.
</p>
<p class="akapit">
To investigate a very promising mechanism from FP - the <i>State Monad</i> - we are going to see some conceptual examples using ScalaZ. Then we will dive a little bit into the mechanics shown in <a href="https://www.manning.com/books/functional-programming-in-scala">https://www.manning.com/books/functional-programming-in-scala</a>. And finally we will take a look at the new library <a href="https://github.com/non/cats">https://github.com/non/cats</a>, which in my current understanding is supposed to become a more modular version of ScalaZ somewhere in the future.
</p>
<h1>State Monad and Definition of Dictionary</h1>
<p>
Some theory worth reading:
<ul>
<li><a href="http://www.slideshare.net/dgalichet/scalaio-statemonad">http://www.slideshare.net/dgalichet/scalaio-statemonad</a> </li>
<li><a href="https://speakerdeck.com/mpilquist/scalaz-state-monad">https://speakerdeck.com/mpilquist/scalaz-state-monad</a> </li>
</ul>
</p>
<p class="akapit">
To recall from the last episode: we want to execute a sequence of transformations on an immutable DataDictionary (aliased as <i>Dictionary</i> in the examples below), which contains DataFrames and some configuration parameters.
Plus we want to log each step - you know - just in case...
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span>,<span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">]</span>
<span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">String</span>
</pre></div>
</p>
<p>
According to the theory, the State Monad wraps a function which turns a state into a tuple (NewState,StateResult) --> <b>S => (S,A)</b> <br/>
So in our case we will have the signature <b>Dictionary => (Dictionary,Log)</b>
</p>
<p>
Take a look at this piece of code :
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> pureTransformation<span style="color: #333333">(</span>key<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span> value<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">,</span> currentLog<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span>
<span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">)</span> <span style="color: #333333">=></span> <span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Dictionary</span><span style="color: #333333">,</span> <span style="color: #BB0066; font-weight: bold">Log</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span>
dict <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> newDict<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Dictionary</span> <span style="color: #333333">=</span> dict <span style="color: #333333">+</span> <span style="color: #333333">(</span>key <span style="color: #333333">-></span> value<span style="color: #333333">)</span>
<span style="color: #333333">(</span>newDict<span style="color: #333333">,</span> currentLog <span style="color: #333333">+</span> s<span style="background-color: #fff0f0">": added $key : $value"</span><span style="color: #333333">)</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
This is a trivial method created for educational purposes. We are just adding a new entry to the existing dictionary and then logging our operation.
Notice that we already have the proper signature: <b>(Dictionary) => (Dictionary, Log)</b>
</p>
<p>
The <i>State</i> companion object in ScalaZ has a constructor which accepts our transforming function:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> stateTransform<span style="color: #333333">(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>value<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">,</span>currentLog<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">=</span><span style="background-color: #fff0f0">""</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span>
<span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Dictionary</span>,<span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]{</span>
pureTransformation<span style="color: #333333">(</span>key<span style="color: #333333">,</span> value<span style="color: #333333">,</span> currentLog<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> state<span style="color: #008800; font-weight: bold">=</span>stateTransform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key4"</span><span style="color: #333333">,</span><span style="color: #6600EE; font-weight: bold">4.0</span><span style="color: #333333">)</span>
<span style="color: #888888">//state: scalaz.State[Dictionary,Log] = scalaz.package$State....</span>
</pre></div>
</p>
<p>
And generally that's it :) Now it is important to understand that... nothing has happened yet. We have just created a <i>description of the transformation</i>. To run the transformation itself, we need to pass the initial state of the dictionary and trigger execution.
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> initialDictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">()</span>
state<span style="color: #333333">.</span>run<span style="color: #333333">(</span>initialDictionary<span style="color: #333333">)</span>
<span style="color: #888888">//res0: (Dictionary, Log) = (Map(key4 -> 4.0),: added key4 : 4.0)</span>
</pre></div>
</p>
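<p>
The "nothing happened yet" part can be observed with a small plain Scala sketch. It does not use ScalaZ: <i>Description</i> is a hypothetical wrapper around a function with the same <i>Dictionary => (Dictionary, Log)</i> shape, and the <i>executions</i> counter exists only to show when the function body really runs.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object LazySketch {
  type Dictionary = Map[String, Any]
  type Log = String

  // Counter used only to observe when the wrapped function is executed
  var executions = 0

  // Hypothetical wrapper: it stores the function, it does not call it
  final case class Description(run: Dictionary => (Dictionary, Log))

  val description = Description { dict =>
    executions += 1
    (dict + ("key4" -> 4.0), ": added key4 : 4.0")
  }

  // The description exists, but the function body has not run yet
  val executionsBeforeRun = executions

  // Passing the initial state triggers the execution
  val (newDictionary, log) = description.run(Map.empty)
  val executionsAfterRun = executions
}
</pre></div>
<p>
Here <i>executionsBeforeRun</i> stays at 0 and becomes 1 only after <i>run</i> receives the initial dictionary.
</p>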
<p>
That worked. However, now we need to solve our main problem - how to compose multiple transformations.
</p>
<h1>Compose transformations</h1>
<p>
I don't want to go deep into the mechanics of the for comprehension, so if someone doesn't know what those arrows in Scala's <i>for</i> mean - some tutorials can be found here --><a href="https://www.google.pl/search?q=scala+for+comprehension">https://www.google.pl/search?q=scala+for+comprehension</a>
</p>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> transformation<span style="color: #008800; font-weight: bold">=for</span><span style="color: #333333">{</span>
log1<span style="color: #008800; font-weight: bold"><-</span>stateTransform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key4"</span><span style="color: #333333">,</span><span style="color: #6600EE; font-weight: bold">4.0</span><span style="color: #333333">)</span>
log2<span style="color: #008800; font-weight: bold"><-</span>stateTransform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key5"</span><span style="color: #333333">,</span><span style="color: #008800; font-weight: bold">true</span><span style="color: #333333">,</span>log1<span style="color: #333333">)</span>
finalLog<span style="color: #008800; font-weight: bold"><-</span>stateTransform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key6"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"someValue"</span><span style="color: #333333">,</span>log2<span style="color: #333333">)</span>
<span style="color: #333333">}</span> <span style="color: #008800; font-weight: bold">yield</span> finalLog
</pre></div>
</p>
<p>
One awkward aspect of using the State Monad for this operation is that everything seems to be "log centered", whereas the essence is the dictionary manipulation, which is handled by the <i>State</i> internals.
</p>
<p>
What is the type of <i>transformation</i> ?
<pre>
transformation: scalaz.IndexedStateT[scalaz.Id.Id,Dictionary,Dictionary,Log]
</pre>
We really don't want to go there, so in this case it may be a good idea to declare the type we want to work with ourselves:
<pre>
val transformation : State[Dictionary,Log]=for{ ...
</pre>
</p>
<p>
And finally - in overview we are building something like this :
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhm1Lpbh3rUM2BOT1YWx0ALs97bO1a2vd4zu0B6ksUysDNo6pXY0OFKXmYxHZKIvitF7eN7uUlxcmmWjzOmjdllHWBJ2nwcbd5UPIE7l4nU-8V6afQDSnqGRRb7t_ORF3sVG4i9mCIa7rVh/s1600/statetransformation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhm1Lpbh3rUM2BOT1YWx0ALs97bO1a2vd4zu0B6ksUysDNo6pXY0OFKXmYxHZKIvitF7eN7uUlxcmmWjzOmjdllHWBJ2nwcbd5UPIE7l4nU-8V6afQDSnqGRRb7t_ORF3sVG4i9mCIa7rVh/s640/statetransformation.png"></a></div>
</p>
<p>
And usage of this educational example will result in :
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Dictionary</span>
<span style="color: #008800; font-weight: bold">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">(</span><span style="background-color: #fff0f0">"key1"</span><span style="color: #333333">-></span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"key2"</span><span style="color: #333333">-></span><span style="color: #0000DD; font-weight: bold">2</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"key3"</span><span style="color: #333333">-></span><span style="background-color: #fff0f0">"value3"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #333333">(</span>finalDictionary<span style="color: #333333">,</span>log<span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span>transformation<span style="color: #333333">.</span>run<span style="color: #333333">(</span>dictionary<span style="color: #333333">)</span>
finalDictionary
<span style="color: #888888">/*res2: Dictionary = Map(</span>
<span style="color: #888888">key4 -> 4.0, </span>
<span style="color: #888888">key5 -> true, </span>
<span style="color: #888888">key1 -> 1, </span>
<span style="color: #888888">key2 -> 2, </span>
<span style="color: #888888">key6 -> someValue, </span>
<span style="color: #888888">key3 -> value3)*/</span>
log
<span style="color: #888888">/*res3: Log = : </span>
<span style="color: #888888">added key4 : 4.0:</span>
<span style="color: #888888">added key5 : true:</span>
<span style="color: #888888">added key6 : someValue*/</span>
</pre></div>
</p>
<h1>Analysis in depth</h1>
<p>
From what I have seen so far, ScalaZ has a very elegant internal design with a lot of code reuse (the same "reuse" which was always the "holy grail" of OOP). However, to fully
understand the State implementation you need to grasp the wider picture.
</p>
<p>
So it may be easier to take a look at the code from <a href="https://www.manning.com/books/functional-programming-in-scala">https://www.manning.com/books/functional-programming-in-scala</a>.
You can find the whole example here : <a href="https://github.com/fpinscala/fpinscala/blob/master/answers/src/main/scala/fpinscala/state/State.scala">https://github.com/fpinscala/fpinscala/blob/master/answers/src/main/scala/fpinscala/state/State.scala</a> but generally we need to focus on two small methods :
</p>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">case</span> <span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">S</span>, <span style="color: #333399; font-weight: bold">+A</span><span style="color: #333333">](</span>run<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">S</span> <span style="color: #333333">=></span> <span style="color: #333333">(</span>A<span style="color: #333333">,</span> S<span style="color: #333333">))</span> <span style="color: #333333">{</span>
<span style="color: #333333">(...)</span>
<span style="color: #008800; font-weight: bold">def</span> flatMap<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">](</span>f<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">A</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">S</span>, <span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">])</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">S</span>, <span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span>
<span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">(</span>s <span style="color: #008800; font-weight: bold">=></span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #333333">(</span>a<span style="color: #333333">,</span> s1<span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> run<span style="color: #333333">(</span>s<span style="color: #333333">)</span>
f<span style="color: #333333">(</span>a<span style="color: #333333">).</span>run<span style="color: #333333">(</span>s1<span style="color: #333333">)</span>
<span style="color: #333333">})</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
So it just chains a sequence of state transitions together into one function called <i>"run"</i> and executes it when you call <i>run</i> with an initial state. Note that this version returns <i>(A, S)</i>, that is (result, new state), while the ScalaZ constructor used earlier takes <i>S => (S, A)</i>; only the tuple order differs.
I hope that the rest of the code is understandable - it always frustrates me when authors state that "the rest are just details", but in this case we are just invoking one function and passing the result to another... the rest are just details :)
</p>
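<p>
To make the chaining concrete, here is a minimal sketch that reuses the <i>flatMap</i> from the excerpt above and adds a <i>map</i>, then rebuilds the dictionary example without ScalaZ. The names <i>addEntry</i> and <i>StateSketch</i> are made up for this illustration, and the explicit <i>flatMap</i>/<i>map</i> calls are what a three-step for comprehension would desugar to.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object StateSketch {
  type Dictionary = Map[String, Any]
  type Log = String

  // Same shape as the excerpt above: run returns (result, newState)
  final case class State[S, A](run: S => (A, S)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State[S, B](s => {
        val (a, s1) = run(s) // run this step first
        f(a).run(s1)         // pass its result and its new state to the next step
      })

    def map[B](f: A => B): State[S, B] =
      flatMap[B](a => State[S, B](s => (f(a), s)))
  }

  // Hypothetical helper: adds one entry and extends the log passed in
  def addEntry(key: String, value: Any, currentLog: Log = ""): State[Dictionary, Log] =
    State[Dictionary, Log](dict => (currentLog + s": added $key : $value", dict + (key -> value)))

  // Desugared form of a for comprehension with three generators
  val transformation: State[Dictionary, Log] =
    addEntry("key4", 4.0).flatMap { log1 =>
      addEntry("key5", true, log1).flatMap { log2 =>
        addEntry("key6", "someValue", log2).map(finalLog => finalLog)
      }
    }

  val (log, finalDictionary) = transformation.run(Map("key1" -> 1))
}
</pre></div>
<p>
Each <i>flatMap</i> only builds a bigger function. The dictionary is threaded from step to step inside <i>run</i>, which is why the code you write talks only about the log.
</p>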
<h1>Spark Example</h1>
<p>
If the whole concept is clear enough, let's try to implement it on DataFrames.
You can find the whole example at: <a href="https://github.com/PawelWlodarski/blog/blob/master/src/main/scala/pl/pawelwlodarski/spark/functionalchain/ChainOnstateMonad.scala">https://github.com/PawelWlodarski/blog/blob/master/src/main/scala/pl/pawelwlodarski/spark/functionalchain/ChainOnstateMonad.scala</a>
</p>
<p>
We are going to simulate two transformations :
<ol>
<li>The first one consists of three separate phases: it adds a couple of columns to the initial frame and then does a simple join </li>
<li>The second one creates a business report with a single select - for simplicity this is a "one phase transformation" </li>
</ol>
</p>
<h2>First Transformation </h2>
<p>
<pre style="background:#fff;color:#000"><span style="color:#ff7800">val</span> <span style="color:#3b5bb5">addTimeStamp</span>: <span style="color:#3b5bb5">DataFrame</span> => <span style="color:#3b5bb5">DataFrame</span> =
{ df =>
df.withColumn(<span style="color:#409b1c">"created"</span>,current_date())
}
<span style="color:#ff7800">val</span> <span style="color:#3b5bb5">addLabel</span>:<span style="color:#ff7800">String</span> => <span style="color:#3b5bb5">DataFrame</span> => <span style="color:#3b5bb5">DataFrame</span> =
{label => df =>
df.withColumn(<span style="color:#409b1c">"label"</span>,lit(label))
}
<span style="color:#ff7800">val</span> <span style="color:#3b5bb5">businessJoin</span> : (<span style="color:#ff7800">String</span>,<span style="color:#ff7800">String</span>) => (<span style="color:#3b5bb5">DataFrame</span>,<span style="color:#3b5bb5">DataFrame</span>) => <span style="color:#3b5bb5">DataFrame</span> =
{(column1,column2) => (df1,df2) =>
df1.join(df2, df1(column1) === df2(column2))
}
</pre>
</p>
<p>
I tried to justify this design in the <a href="http://usethiscode.blogspot.com/2015/09/functional-data-transformation-with.html">Previous Post</a>. Generally, here we have simple, primitive, stateless functions which are easy to test and compose.
</p>
<h2>Lifting to Transformation</h2>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_0BQTnY9MyPoObuLx6RU2hoSXcqKfVDxxSjb0_d3l5Ujay6Pe1LhQzz1jN5Q92yfPZCMpQRCyKwM0aPr412MLbWTb2sboMT1rOBDwv0iE41R9HfS0z1qgoFVzS_9Ap1aE_IvIq1sdgHTL/s1600/lifting.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_0BQTnY9MyPoObuLx6RU2hoSXcqKfVDxxSjb0_d3l5Ujay6Pe1LhQzz1jN5Q92yfPZCMpQRCyKwM0aPr412MLbWTb2sboMT1rOBDwv0iE41R9HfS0z1qgoFVzS_9Ap1aE_IvIq1sdgHTL/s640/lifting.png"></a></div>
<p>
Now we need some machinery to lift those small functions into the transformation context:
</p>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #007020">type</span> TransformationPhase<span style="color: #333333">=</span>DataDictionary <span style="color: #333333">=></span> (DataDictionary,Log)
trait Extractor[A]{
<span style="color: #008800; font-weight: bold">def</span> <span style="color: #0066BB; font-weight: bold">extract</span>(dictionary:DataDictionary)(key:String):A
}
implicit <span style="color: #007020">object</span> DataFramExtractor extends Extractor[DataFrame] {
override <span style="color: #008800; font-weight: bold">def</span> <span style="color: #0066BB; font-weight: bold">extract</span>(dictionary:DataDictionary)(key: String): DataFrame <span style="color: #333333">=</span>
dictionary(key)<span style="color: #333333">.</span>asInstanceOf[DataFrame]
}
<span style="color: #008800; font-weight: bold">def</span> <span style="color: #0066BB; font-weight: bold">liftToTransformation</span>[A:Extractor](f:A<span style="color: #333333">=></span>DataFrame)(key1:String)(resultKey:String)
:String <span style="color: #333333">=></span> TransformationPhase <span style="color: #333333">=</span>
{currentLog <span style="color: #333333">=></span> dictionary <span style="color: #333333">=></span>
val param1 <span style="color: #333333">=</span>implicitly[Extractor[A]]<span style="color: #333333">.</span>extract(dictionary)(key1)
val result<span style="color: #333333">=</span>f(param1)
val log<span style="color: #333333">=</span>currentLog <span style="color: #333333">+</span> s<span style="background-color: #fff0f0">"</span><span style="color: #666666; font-weight: bold; background-color: #fff0f0">\n</span><span style="background-color: #fff0f0">adding $resultKey -> $result"</span>
val newDictionary<span style="color: #333333">=</span>dictionary <span style="color: #333333">+</span> (resultKey <span style="color: #333333">-></span> result)
(newDictionary,log)
}
</pre></div>
</p>
<p class="akapit">
A couple of important notes. First of all, the parameters are split into separate groups: the first one is the primitive function and the rest are configuration. Maybe Lenses would be better for getting and setting values in the dictionary, but what we have is good enough for now. Extractors allow us to retrieve a value of any type from the dictionary (here only DataFrame, for simplicity). And the last thing - we need to compose logs as well as results, and that's why the lifted function is curried: <b>Log => DataDictionary => (DataDictionary,Log)</b>
</p>
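<p>
The lift shown above takes a single input key, while the join phase later in the post passes two keys. The linked repository may well contain its own variant for that; the sketch below is only my assumption of how a two-input lift could look. <i>liftToTransformation2</i>, <i>IntExtractor</i> and the <i>add</i> function are hypothetical names, and plain <i>Int</i> values stand in for DataFrames so the snippet runs without Spark.
</p>

```scala
object LiftTwoSketch {
  type DataDictionary = Map[String, Any]
  type Log = String
  type TransformationPhase = DataDictionary => (DataDictionary, Log)

  trait Extractor[A] {
    def extract(dictionary: DataDictionary)(key: String): A
  }

  // Stand-in extractor so the sketch runs without Spark; the post extracts DataFrames.
  implicit object IntExtractor extends Extractor[Int] {
    override def extract(dictionary: DataDictionary)(key: String): Int =
      dictionary(key).asInstanceOf[Int]
  }

  // Hypothetical two-input variant: reads two values, applies f, stores the result.
  def liftToTransformation2[A: Extractor, B: Extractor, R](f: (A, B) => R)
                                                          (key1: String, key2: String)
                                                          (resultKey: String): Log => TransformationPhase = {
    currentLog => dictionary =>
      val param1 = implicitly[Extractor[A]].extract(dictionary)(key1)
      val param2 = implicitly[Extractor[B]].extract(dictionary)(key2)
      val result = f(param1, param2)
      val log = currentLog + s"\nadding $resultKey -> $result"
      (dictionary + (resultKey -> result), log)
  }

  // Stand-in for a primitive two-input function such as a join.
  val add: (Int, Int) => Int = _ + _

  val addPhase = liftToTransformation2(add)("left", "right")("sum")
  val (resultDictionary, log) = addPhase("start")(Map("left" -> 1, "right" -> 2))
}
```

<p>
The call shape is the same as for the one-key lift: primitive function first, then the input keys, then the result key. Only the middle parameter group grows.
</p>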
<h2>Building transformations</h2>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> addTimestampPhase<span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>addTimeStamp<span style="color: #333333">)(</span><span style="background-color: #fff0f0">"InitialFrame"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"WithTimeStamp"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> addLabelPhase<span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>addLabel<span style="color: #333333">(</span><span style="background-color: #fff0f0">"experiment"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"WithTimeStamp"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"Labelled"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> businessJoinPhase<span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>businessJoin<span style="color: #333333">(</span><span style="background-color: #fff0f0">"customerId"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"id"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"Labelled"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"SomeOtherFrame"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"JoinedByBusinessRules"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> transformation1<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Log</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataDictionary</span>,<span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span>
initialLog <span style="color: #008800; font-weight: bold">=></span> <span style="color: #008800; font-weight: bold">for</span><span style="color: #333333">{</span>
log1 <span style="color: #008800; font-weight: bold"><-</span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">(</span>addTimestampPhase<span style="color: #333333">(</span>initialLog<span style="color: #333333">))</span>
log2 <span style="color: #008800; font-weight: bold"><-</span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">(</span>addLabelPhase<span style="color: #333333">(</span>log1<span style="color: #333333">))</span>
log3 <span style="color: #008800; font-weight: bold"><-</span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">(</span>businessJoinPhase<span style="color: #333333">(</span>log2<span style="color: #333333">))</span>
<span style="color: #333333">}</span> <span style="color: #008800; font-weight: bold">yield</span> log3
</pre></div>
</p>
<p>
We build a State for each transformation phase and then compose everything into one transformation. It may be more convenient to move the <i>State(phase)</i> part into the <i>lift</i> method.
</p>
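<p>
A minimal sketch of that idea, under my own assumptions: <i>liftToState</i> and <i>put</i> are hypothetical helpers, and the tiny <i>State</i> class below is only a stand-in for the Scalaz one, using the same <i>(newState, value)</i> tuple order as <i>TransformationPhase</i>.
</p>

```scala
object LiftToStateSketch {
  type DataDictionary = Map[String, Any]
  type Log = String

  // Minimal stand-in for a State monad; run returns (newState, value).
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State { s => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  // Hypothetical helper: wraps an already lifted phase so callers never write State(...).
  def liftToState(phase: Log => DataDictionary => (DataDictionary, Log)): Log => State[DataDictionary, Log] =
    log => State(phase(log))

  // Stand-in phase: stores a value under a key and appends a line to the log.
  def put(key: String, value: Any): Log => DataDictionary => (DataDictionary, Log) =
    log => dictionary => (dictionary + (key -> value), log + s"\nadding $key -> $value")

  val first = liftToState(put("a", 1))
  val second = liftToState(put("b", 2))

  // The for-comprehension no longer needs the explicit State(...) wrapping.
  val transformation: Log => State[DataDictionary, Log] =
    initialLog =>
      for {
        log1 <- first(initialLog)
        log2 <- second(log1)
      } yield log2

  val (resultDictionary, log) = transformation("").run(Map.empty)
}
```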
<p>
And, as mentioned at the beginning, another simple transformation:
</p>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #888888">//transformation2</span>
<span style="color: #008800; font-weight: bold">val</span> importantSelect<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #008800; font-weight: bold">_</span><span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"customerId"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"credit"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"label"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"created"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> importantSelectPhase <span style="color: #008800; font-weight: bold">=</span>
liftToTransformation<span style="color: #333333">(</span>importantSelect<span style="color: #333333">)(</span><span style="background-color: #fff0f0">"JoinedByBusinessRules"</span><span style="color: #333333">)(</span><span style="background-color: #fff0f0">"BusinessReport"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> transformation2<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Log</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataDictionary</span>,<span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]</span><span style="color: #008800; font-weight: bold">=</span>
initialLog <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">(</span>importantSelectPhase<span style="color: #333333">(</span>initialLog<span style="color: #333333">))</span>
</pre></div>
</p>
<p>
You may have noticed one interesting thing - a transformation is actually not a State but a function <b>Log => State[DataDictionary,Log]</b>. The reason is that this was the only way I found to pass the log from one transformation to another in order to combine them. Without passing logs we could stay with a simple State[DataDictionary,Log].
</p>
<h2>Composition! </h2>
<p>
And now the moment everyone was waiting for! Transformation composition!
</p>
<p>
<pre style="background:#fff;color:#000"><span style="color:#ff7800">val</span> <span style="color:#3b5bb5">transformationComposed</span>:<span style="color:#3b5bb5">Log</span> => <span style="color:#3b5bb5">State</span>[<span style="color:#3b5bb5">DataDictionary</span>,<span style="color:#3b5bb5">Log</span>]=
initialLog=><span style="color:#ff7800">for</span>{
logT1 <- transformation1(initialLog)
logT2 <- transformation2(logT1)
} <span style="color:#ff7800">yield</span> logT2
</pre>
</p>
<p>
Yes, this was it. This was "The Moment". Your life has just changed forever...
</p>
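<p>
Because every transformation has the shape <b>Log => State[S,Log]</b>, composing two of them boils down to a single <i>flatMap</i> (functions of this shape are often called Kleisli arrows). The sketch below is an assumption-laden illustration, not code from the repository: <i>andThenT</i>, <i>double</i> and <i>increment</i> are hypothetical names, and a tiny stand-in <i>State</i> over an <i>Int</i> replaces the real one.
</p>

```scala
object ComposeTransformationsSketch {
  // Minimal stand-in for a State monad; run returns (newState, value).
  final case class State[S, A](run: S => (S, A)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  type Log = String
  type Transformation[S] = Log => State[S, Log]

  // Hypothetical combinator: the log produced by `first` becomes the input of `second`.
  def andThenT[S](first: Transformation[S], second: Transformation[S]): Transformation[S] =
    initialLog => first(initialLog).flatMap(second)

  // Stand-in transformations working on an Int state.
  val double: Transformation[Int] = log => State(n => (n * 2, log + "|doubled"))
  val increment: Transformation[Int] = log => State(n => (n + 1, log + "|incremented"))

  val composed: Transformation[Int] = andThenT(double, increment)
  val (finalState, log) = composed("start").run(5)
}
```

<p>
Running the composed transformation on 5 first doubles and then increments the state, while the log is threaded through both steps.
</p>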
<h2>The Program</h2>
<p>
Let's check if what we created actually works.
</p>
<p>
<pre style="background:#fff;color:#000"> <span style="color:#ff7800">def</span> <span style="color:#3b5bb5">main</span>(args: Array[String]): Unit = {
val config<span style="color:#ff7800">=</span>new SparkConf()
.setMaster("local[4]")
.setAppName("Dataframes transformation with State Monad")
val sc<span style="color:#ff7800">=</span>new SparkContext(config)
val sqlContext<span style="color:#ff7800">=</span>new SQLContext(sc)
import sqlContext.implicits._
println("example start")
val df1<span style="color:#ff7800">=</span>sc.parallelize(Seq(
(<span style="color:#3b5bb5">1</span>,<span style="color:#409b1c">"cust1@gmail.com"</span>,<span style="color:#409b1c">"Stefan"</span>),
(<span style="color:#3b5bb5">2</span>,<span style="color:#409b1c">"cust2@gmail.com"</span>,<span style="color:#409b1c">"Zdzislawa"</span>),
(<span style="color:#3b5bb5">3</span>,<span style="color:#409b1c">"cust3@gmail.com"</span>,<span style="color:#409b1c">"Bonifacy"</span>),
(<span style="color:#3b5bb5">4</span>,<span style="color:#409b1c">"cust4@gmail.com"</span>,<span style="color:#409b1c">"Bozebozebozenka"</span>)
)).toDF(<span style="color:#409b1c">"customerId"</span>,<span style="color:#409b1c">"email"</span>,<span style="color:#409b1c">"name"</span>)
val df2<span style="color:#ff7800">=</span>sc.parallelize(Seq(
(<span style="color:#3b5bb5">1</span>,<span style="color:#3b5bb5">10</span>),
(<span style="color:#3b5bb5">2</span>,<span style="color:#3b5bb5">20</span>),
(<span style="color:#3b5bb5">3</span>,<span style="color:#3b5bb5">30</span>),
(<span style="color:#3b5bb5">4</span>,<span style="color:#3b5bb5">40</span>)
)).toDF(<span style="color:#409b1c">"id"</span>,<span style="color:#409b1c">"credit"</span>)
val dictionary:DataDictionary<span style="color:#ff7800">=</span>Map(<span style="color:#409b1c">"InitialFrame"</span> <span style="color:#ff7800">-</span><span style="color:#ff7800">></span> df1,<span style="color:#409b1c">"SomeOtherFrame"</span><span style="color:#ff7800">-</span><span style="color:#ff7800">></span>df2)
val (resultDictionary,log)=transformationComposed("").run(dictionary)
println("**************LOG*************** : "+log)
println("**************DICTIONARY********")
resultDictionary.foreach(println)
val result<span style="color:#ff7800">=</span>resultDictionary(<span style="color:#409b1c">"BusinessReport"</span>).asInstanceOf[DataFrame]
result.show()
}
</pre>
</p>
<p>
We have created two dataframes: the first one represents some users and the second one the users' credits. Then we build the final report.
</p>
<h2>Results</h2>
<p>The log looks good - everything is in there. </p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">**************LOG*************** :
adding WithTimeStamp -> <span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date<span style="color: #333333">]</span>
adding Labelled -> <span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string<span style="color: #333333">]</span>
adding JoinedByBusinessRules -> <span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string, id: int, credit: int<span style="color: #333333">]</span>
adding BusinessReport -> <span style="color: #333333">[</span>customerId: int, credit: int, label: string, created: date<span style="color: #333333">]</span>
</pre></div>
<p>The dictionary looks good - everything is in there. </p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">**************DICTIONARY********
<span style="color: #333333">(</span>JoinedByBusinessRules,<span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string, id: int, credit: int<span style="color: #333333">])</span>
<span style="color: #333333">(</span>WithTimeStamp,<span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date<span style="color: #333333">])</span>
<span style="color: #333333">(</span>BusinessReport,<span style="color: #333333">[</span>customerId: int, credit: int, label: string, created: date<span style="color: #333333">])</span>
<span style="color: #333333">(</span>InitialFrame,<span style="color: #333333">[</span>customerId: int, email: string, name: string<span style="color: #333333">])</span>
<span style="color: #333333">(</span>SomeOtherFrame,<span style="color: #333333">[</span>id: int, credit: int<span style="color: #333333">])</span>
<span style="color: #333333">(</span>Labelled,<span style="color: #333333">[</span>customerId: int, email: string, name: string, created: date, label: string<span style="color: #333333">])</span>
</pre></div>
<p>The final report looks good - everything is in there. </p>
<pre>
+----------+------+----------+----------+
|customerId|credit| label| created|
+----------+------+----------+----------+
| 1| 10|experiment|2015-10-04|
| 2| 20|experiment|2015-10-04|
| 3| 30|experiment|2015-10-04|
| 4| 40|experiment|2015-10-04|
+----------+------+----------+----------+
</pre>
<h1>Cats library</h1>
<p class="akapit">
My current experience tells me that when working with Spark it is good to have as few external dependencies as possible. Another library similar to ScalaZ is being created and, from what I understand, it is meant to be more modular. The library is called cats -> <a href="https://github.com/non/cats">https://github.com/non/cats</a>
</p>
<p>
You can find an educational example similar to the one from the beginning of this post here --> <a href="https://github.com/PawelWlodarski/blog/blob/master/src/main/scala/pl/pawelwlodarski/spark/statemonad/CatsState.sc">Example In Cats</a>
</p>
<p>
For now it is difficult for me to say what the difference between those two libraries is, because the example looks very similar:
</p>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">cats.free.Trampoline</span>
<span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">cats.state._</span>
<span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">cats.std.all._</span>
<span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span>,<span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">]</span>
<span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">String</span>
<span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">(</span><span style="background-color: #fff0f0">"key1"</span><span style="color: #333333">-></span><span style="color: #0000DD; font-weight: bold">1</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"key2"</span><span style="color: #333333">-></span><span style="color: #0000DD; font-weight: bold">2</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"key3"</span><span style="color: #333333">-></span><span style="background-color: #fff0f0">"value3"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">def</span> transform<span style="color: #333333">(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>value<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">,</span>currentLog<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">=</span><span style="background-color: #fff0f0">""</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Dictionary</span>,<span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #BB0066; font-weight: bold">State</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Dictionary</span>,<span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]{</span> dict <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> newDict<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Dictionary</span><span style="color: #333333">=</span>dict <span style="color: #333333">+</span> <span style="color: #333333">(</span>key <span style="color: #333333">-></span> value<span style="color: #333333">)</span>
<span style="color: #333333">(</span>newDict<span style="color: #333333">,</span>currentLog <span style="color: #333333">+</span> s<span style="background-color: #fff0f0">": added $key : $value"</span><span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> state<span style="color: #008800; font-weight: bold">=</span>transform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key4"</span><span style="color: #333333">,</span><span style="color: #6600EE; font-weight: bold">4.0</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> toRun<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Trampoline</span><span style="color: #333333">[(</span><span style="color: #333399; font-weight: bold">Dictionary</span>, <span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">)]</span> <span style="color: #008800; font-weight: bold">=</span> state<span style="color: #333333">.</span>run<span style="color: #333333">(</span>dictionary<span style="color: #333333">)</span>
toRun<span style="color: #333333">.</span>run
<span style="color: #008800; font-weight: bold">val</span> state2<span style="color: #008800; font-weight: bold">=</span>transform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key5"</span><span style="color: #333333">,</span><span style="color: #008800; font-weight: bold">true</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> transformation<span style="color: #008800; font-weight: bold">=for</span><span style="color: #333333">{</span>
s1<span style="color: #008800; font-weight: bold"><-</span>transform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key4"</span><span style="color: #333333">,</span><span style="color: #6600EE; font-weight: bold">4.0</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">""</span><span style="color: #333333">)</span>
s2<span style="color: #008800; font-weight: bold"><-</span>transform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key5"</span><span style="color: #333333">,</span><span style="color: #6600EE; font-weight: bold">5.0</span><span style="color: #333333">,</span>s1<span style="color: #333333">)</span>
<span style="color: #333333">}</span> <span style="color: #008800; font-weight: bold">yield</span> s2
transformation<span style="color: #333333">.</span>run<span style="color: #333333">(</span>dictionary<span style="color: #333333">).</span>run
state<span style="color: #333333">.</span>flatMap<span style="color: #333333">(</span>log<span style="color: #008800; font-weight: bold">=></span>transform<span style="color: #333333">(</span><span style="background-color: #fff0f0">"key5"</span><span style="color: #333333">,</span><span style="color: #008800; font-weight: bold">true</span><span style="color: #333333">,</span>log<span style="color: #333333">)).</span>run<span style="color: #333333">(</span>dictionary<span style="color: #333333">).</span>run
</pre></div>
</p>
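<p>
The imports above match the Cats version available when this post was written. As a hedged side note: assuming a more recent Cats release (1.x or later), <i>State</i> lives in <i>cats.data</i> and <i>run</i> returns an <i>Eval</i> that is forced with <i>.value</i> instead of a <i>Trampoline</i>. A rough equivalent of the example could then look like the sketch below; <i>CatsStateSketch</i> is just a hypothetical wrapper object.
</p>

```scala
import cats.data.State

object CatsStateSketch {
  type Dictionary = Map[String, Any]
  type Log = String

  // Same idea as transform above: add a key and append to the log.
  def transform(key: String, value: Any, currentLog: Log = ""): State[Dictionary, Log] =
    State[Dictionary, Log] { dict =>
      (dict + (key -> value), currentLog + s": added $key : $value")
    }

  val transformation: State[Dictionary, Log] =
    for {
      s1 <- transform("key4", 4.0)
      s2 <- transform("key5", 5.0, s1)
    } yield s2

  val dictionary: Dictionary = Map("key1" -> 1)

  // run returns Eval[(Dictionary, Log)]; .value forces the evaluation.
  val (resultDictionary, log) = transformation.run(dictionary).value
}
```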
<h2>What next? </h2>
<ul>
<li>Can we make logging easier with the Writer monad? </li>
<li>Is it possible to use Lenses instead of simple Strings to operate on DataDictionary? </li>
<li>Scalaz has an interesting construct - <i>Monad.whileM</i> - can we use it to implement some kind of composable loops in our transformations?</li>
</ul>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFn2HsUSgCM1iBkRMYuYwe8BZscS7Z2oPHr6wXwk1icj5mWKWqMGZHbD8XPps_4bDrVO1HKos3x0NKsabRhQnmqdZnoTBqH6t__mvNzoEFxpI1w0c89bj1SoV57Tlg5EET38sT7lPd9FTV/s1600/zachod2.JPG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFn2HsUSgCM1iBkRMYuYwe8BZscS7Z2oPHr6wXwk1icj5mWKWqMGZHbD8XPps_4bDrVO1HKos3x0NKsabRhQnmqdZnoTBqH6t__mvNzoEFxpI1w0c89bj1SoV57Tlg5EET38sT7lPd9FTV/s400/zachod2.JPG"></a></div>
<h1>Functional Data Transformation with Spark</h1>
<p>Paweł Włodarski, 2015-09-20</p>
<p class="akapit">
Do you need to write some transformations with Spark DataFrames that are easy to test and maintain? I hope that this article will give you many useful ideas on how to approach this problem.
</p>
<p>
That should be enough for an introduction - let's get to the content!
</p>
<h1>An abstract understanding of the problem</h1>
<p class="akapit">
Let's try to understand the problem conceptually before we dive into details. In the picture below you can see an example transformation flow when working with dataframes.
In Spark, DataFrames are immutable, which can save us from many strange problems. Moreover, we can design our flow out of small (properly granulated) methods, which simplifies testing a lot.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQv2XJ2opLjXZzRZisTGh8z1125J4XfZCcjXqpASt4kLFfScPax7LDdTkvTnt-iDfn69qAArDLqWT9ShI8jRGlerg6sFxcn1N9DcRXEDy4dd3rujihrD7DsOVGsxDbNae6n5LDDnpf2nD2/s1600/dataframesflow.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQv2XJ2opLjXZzRZisTGh8z1125J4XfZCcjXqpASt4kLFfScPax7LDdTkvTnt-iDfn69qAArDLqWT9ShI8jRGlerg6sFxcn1N9DcRXEDy4dd3rujihrD7DsOVGsxDbNae6n5LDDnpf2nD2/s640/dataframesflow.png" /></a></div>
<p class="akapit">
But what if what we just saw is only a piece of a bigger system? Technically we have created a procedure which produces a result from 5 input parameters.
There is a great risk that with this design we are going to come across the same problems as with standard procedural programming:
<ul>
<li>A set of tightly coupled methods which are difficult to use outside the given procedure.</li>
<li>Some bags of methods like <i>"Utils classes"</i>, with a set of operations representing common logic which may be reused "here and there" but most likely cannot be easily composed together to produce new behavior.</li>
</ul>
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEoV-63DHWx-j7qRM-ez34M0iFREbAc9H3qhc0EQqM72KsxI5FutrCrF028EH0yCuWut8RcVhg17X3V2mlLpwXDwXr8GxsXcaH_4eTG-Tuq8Supmaz67UwW2KdxJaXqfeyJfmGsS1ZeV3U/s1600/system.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEoV-63DHWx-j7qRM-ez34M0iFREbAc9H3qhc0EQqM72KsxI5FutrCrF028EH0yCuWut8RcVhg17X3V2mlLpwXDwXr8GxsXcaH_4eTG-Tuq8Supmaz67UwW2KdxJaXqfeyJfmGsS1ZeV3U/s640/system.png" /></a></div>
<p class="akapit">
According to my current knowledge, a design based on Functional Programming gives us many more opportunities to compose two pieces of a program and produce new logic in an elegant way. It is also easier to test each piece of such a design in isolation than with the procedural approach. So let's move into the FP world and see what is waiting for us there...
</p>
<h2>What we want to achieve - a view from top</h2>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEVciCluWVj3tu0S3Dd7oZACv-edd4qnVrnUZYsX9vXvtIVX63wdWgbb3MtqMXPJp26tiM9L57NjmloV3Bs8wkAOwAEV7SJFzVWUwy3YkuDUrSTiOvNCKn-H5l0aAFUOPqMRXl7cpseYYJ/s1600/functionalpipeline.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiEVciCluWVj3tu0S3Dd7oZACv-edd4qnVrnUZYsX9vXvtIVX63wdWgbb3MtqMXPJp26tiM9L57NjmloV3Bs8wkAOwAEV7SJFzVWUwy3YkuDUrSTiOvNCKn-H5l0aAFUOPqMRXl7cpseYYJ/s640/functionalpipeline.png" /></a></div>
<p class="akapit">
In this mysterious drawing we can see something which (let's hope) can be called a "Functional Data Processing Pipeline". And those wonderful rectangles symbolize various compositions of single functions, currying and whatever else your brain is able to see there.
</p>
<p>
Until now we have been moving at a very, very abstract level of description. To gain a better understanding, let's now dive into the code and see how to build it from the bottom up.
</p>
<h2>What we want to achieve - a view from bottom (and finally some code)</h2>
<p class="akapit">
The following examples may seem extraordinarily simple, but their purpose is to serve as an "educational example".
</p>
<p class="akapit">
In the first piece of code below we can see a transformation chain consisting of three functions. The first one, <i>generalTimestampFunction</i>, may represent very general logic common to many independent transformations in our system. <i>businessOperation</i> is a very unsophisticated select based on some business criteria. And finally we have some kind of report at the end.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">org.apache.spark.sql.DataFrame</span>
<span style="color: #008800; font-weight: bold">val</span> generalTimestampFunction<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">=></span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span>df <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">org.apache.spark.sql.functions._</span>
df<span style="color: #333333">.</span>withColumn<span style="color: #333333">(</span><span style="background-color: #fff0f0">"created"</span><span style="color: #333333">,</span>current_date<span style="color: #333333">())</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> businessOperation<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span> df <span style="color: #008800; font-weight: bold">=></span>
df<span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"column_with_business_meaning1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"other_column1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"other_column2"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"created"</span><span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> businessReport<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span> df <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">org.apache.spark.sql.functions._</span>
df<span style="color: #333333">.</span>groupBy<span style="color: #333333">(</span><span style="background-color: #fff0f0">"created"</span><span style="color: #333333">).</span>agg<span style="color: #333333">(</span>count<span style="color: #333333">(</span><span style="background-color: #fff0f0">"column_with_business_meaning1"</span><span style="color: #333333">))</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> chain1 <span style="color: #008800; font-weight: bold">=</span>generalTimestampFunction andThen businessOperation andThen businessReport
</pre></div>
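<p>
If you want to play with this kind of composition without a Spark session at hand, here is a minimal sketch in plain Scala. It assumes a hypothetical <i>Frame</i> type (a list of rows) standing in for DataFrame. The names <i>addCreated</i>, <i>selectColumns</i> and <i>report</i>, as well as the fixed date, are illustrative only and are not part of the pipeline above.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object AndThenSketch {
  // Assumption: a stand-in for Spark's DataFrame, each row maps a column name to a value.
  type Frame = List[Map[String, Any]]

  // Adds a constant "created" column to every row (a fixed date keeps the sketch deterministic).
  val addCreated: Frame => Frame = frame =>
    frame.map(row => row + ("created" -> "2017-09-05"))

  // Keeps only the columns we care about.
  val wanted = Set("name", "created")
  val selectColumns: Frame => Frame = frame =>
    frame.map(row => row.filter { case (column, _) => wanted.contains(column) })

  // Counts rows per "created" value and returns the counts as a new frame.
  val report: Frame => Frame = frame =>
    frame.groupBy(row => row("created")).toList.map { case (created, rows) =>
      Map("created" -> created, "count" -> rows.size)
    }

  // Function1.andThen glues the three steps into one Frame => Frame.
  val chain: Frame => Frame = addCreated andThen selectColumns andThen report

  val input: Frame = List(Map("name" -> "a", "noise" -> 1), Map("name" -> "b", "noise" -> 2))
  val result: Frame = chain(input)

  def main(args: Array[String]): Unit = println(result)
}
</pre></div>
<p>
For the two input rows the chain produces a single row with the shared date and a count of 2. Each step can also be called on its own with a hand-made <i>Frame</i>, which is exactly what makes such functions easy to test.
</p>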
<p class="akapit">
Just to stay focused on the purpose of this exercise: we want to show how moving from a procedural to a functional design simplifies composition and testing. Testing itself won't be covered in this article (I need material for further posts), but you may use your experience and imagination to see how easy those functions are to test. OK, next example.
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> preconfiguredLogic <span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Long</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=</span> businessKey <span style="color: #008800; font-weight: bold">=></span> <span style="color: #333333">{</span>inputFrame <span style="color: #008800; font-weight: bold">=></span>
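<span style="color: #888888">//filter first: "businessRelation" is not among the selected columns</span>
inputFrame<span style="color: #333333">.</span>where<span style="color: #333333">(</span>inputFrame<span style="color: #333333">(</span><span style="background-color: #fff0f0">"businessRelation"</span><span style="color: #333333">)===</span> businessKey<span style="color: #333333">)</span>
<span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"column1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"column2"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"created"</span><span style="color: #333333">)</span>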
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #BB0066; font-weight: bold">CARS_INDUSTRY</span><span style="color: #008800; font-weight: bold">=</span><span style="color: #0000DD; font-weight: bold">1L</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #BB0066; font-weight: bold">MOVIE_INDUSTRY</span><span style="color: #008800; font-weight: bold">=</span><span style="color: #0000DD; font-weight: bold">2L</span>
<span style="color: #008800; font-weight: bold">val</span> chain2<span style="color: #008800; font-weight: bold">=</span>generalTimestampFunction andThen preconfiguredLogic<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">CARS_INDUSTRY</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> chain3<span style="color: #008800; font-weight: bold">=</span>generalTimestampFunction andThen preconfiguredLogic<span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">MOVIE_INDUSTRY</span><span style="color: #333333">)</span>
</pre></div>
<p>
Here we are using currying to inject some configuration into our function, which is then composed with the rest of the chain. It satisfies my subjective sense of good design - I hope your perception is at least a little bit similar :)
</p>
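<p>
The same currying trick can be tried in isolation with plain Scala collections. This is only a sketch under stated assumptions: <i>Frame</i> is a hypothetical stand-in for DataFrame (rows of industry id and amount), and <i>onlyIndustry</i>, <i>dropNegative</i>, <i>Cars</i> and <i>Movies</i> are names invented for the illustration.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object CurryingSketch {
  // Assumption: a stand-in for a DataFrame, rows of (industryId, amount).
  type Frame = List[(Long, Int)]

  // Configuration first, data second: applying the key yields a plain Frame => Frame.
  val onlyIndustry: Long => Frame => Frame = industryId => frame =>
    frame.filter { case (id, _) => id == industryId }

  // A general step shared by both chains.
  val dropNegative: Frame => Frame = frame =>
    frame.filter { case (_, amount) => amount >= 0 }

  val Cars = 1L
  val Movies = 2L

  // The partially applied function fits into andThen like any other Function1.
  val carsChain: Frame => Frame = dropNegative andThen onlyIndustry(Cars)
  val moviesChain: Frame => Frame = dropNegative andThen onlyIndustry(Movies)

  val input: Frame = List((1L, 10), (2L, 20), (1L, -5))

  def main(args: Array[String]): Unit = {
    println(carsChain(input))
    println(moviesChain(input))
  }
}
</pre></div>
<p>
With this input <i>carsChain</i> keeps only the row (1, 10) and <i>moviesChain</i> only the row (2, 20), so one definition of the logic serves both configurations.
</p>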
<p>
Until now we have been operating on functions that take a single DataFrame as an argument. There are also situations where two or more DataFrames interact with each other to produce some meaningful result:
</p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> businessJoin<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">,</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #333333">={</span> <span style="color: #333333">(</span>frame1<span style="color: #333333">,</span>frame2<span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=></span>
frame1<span style="color: #333333">.</span>join<span style="color: #333333">(</span>frame2<span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Seq</span><span style="color: #333333">(</span><span style="background-color: #fff0f0">"column_with_business_meaning1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"column_with_business_meaning2"</span><span style="color: #333333">))</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> businessUnion<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">,</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #333333">={</span> <span style="color: #333333">(</span>frame1<span style="color: #333333">,</span>frame2<span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #888888">//some assertions that both frames are prepared for union</span>
frame1 unionAll frame2
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> businessOperation2<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">{</span> df <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">import</span> <span style="color: #0e84b5; font-weight: bold">org.apache.spark.sql.functions._</span>
df<span style="color: #333333">.</span>groupBy<span style="color: #333333">(</span><span style="background-color: #fff0f0">"column_with_business_meaning1"</span><span style="color: #333333">).</span>agg<span style="color: #333333">(</span>max<span style="color: #333333">(</span><span style="background-color: #fff0f0">"expected_column1"</span><span style="color: #333333">),</span>count<span style="color: #333333">(</span><span style="background-color: #fff0f0">"expected_column2"</span><span style="color: #333333">))</span>
<span style="color: #333333">}</span>
</pre></div>
<p class="akapit">
In this case we are going to use <i>Function2</i>, and maybe later <i>Function3</i> or <i>Function4</i>, where using <i>andThen</i> is not that easy because <i>Function2</i> does not provide it directly. So now is a very good moment to come back to our diagram with
an abstract function chain transforming the initial data in several phases: <b>Data=>Data</b>
</p>
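<p>
For the two-argument case alone there is a small workaround in the standard library worth knowing: <i>tupled</i> turns a <i>Function2</i> into a <i>Function1</i> over a pair, and that one does have <i>andThen</i>. A minimal sketch, assuming a hypothetical <i>Frame</i> stand-in (a list of numbers) and illustrative names <i>union</i> and <i>keepEven</i>:
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object TupledSketch {
  // Assumption: a stand-in for a DataFrame, just a list of numbers.
  type Frame = List[Int]

  // A two-argument step, analogous in shape to businessUnion.
  val union: (Frame, Frame) => Frame = (left, right) => left ++ right

  // A one-argument step that should run on the union result.
  val keepEven: Frame => Frame = frame => frame.filter(_ % 2 == 0)

  // tupled changes (Frame, Frame) => Frame into ((Frame, Frame)) => Frame,
  // which is a Function1 and therefore composes with andThen.
  val unionThenKeepEven: ((Frame, Frame)) => Frame = union.tupled andThen keepEven

  def main(args: Array[String]): Unit =
    println(unionThenKeepEven((List(1, 2), List(3, 4))))
}
</pre></div>
<p>
This helps only at the head of a chain, because every later step still receives a single value. That limitation is one reason to look for a more uniform signature.
</p>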
<h1>Data => Data and lifting to transformation </h1>
<p class="akapit">
Let's return to this picture once again :
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYWnWjNGGE1Hoajbbrl_UwK2WzyafNkfYbe7vkhMFfrQ7SN1YeCynxjiCWVKPVJKFLW-2KWivnR3ZuciQ2MfgxlgxR_hFLigBx5xj4ywN_vpRtF6r4TlPtYdv4Oar6hhi0OsDwHO_MG9zr/s1600/functionalpipeline.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhYWnWjNGGE1Hoajbbrl_UwK2WzyafNkfYbe7vkhMFfrQ7SN1YeCynxjiCWVKPVJKFLW-2KWivnR3ZuciQ2MfgxlgxR_hFLigBx5xj4ywN_vpRtF6r4TlPtYdv4Oar6hhi0OsDwHO_MG9zr/s640/functionalpipeline.png" /></a></div>
<p>
<b>InitialData</b> in our case may be a set of many DataFrames plus some additional primitives like Long, String or whatever else is taken from the configuration.
The requirements are quite simple and a standard Map should be enough to fulfill them:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Map</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span>,<span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">]</span>
</pre></div>
</p>
<p>
So our rectangles will now take the form of a function which turns one Datadictionary into another:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Datadictionary</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">Datadictionary</span>
</pre></div>
</p>
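<p>
A nice consequence of this signature is that every phase has the same input and output type, so phases compose with <i>andThen</i> or, for a whole list, with <i>Function.chain</i> from the standard library. The sketch below assumes plain values in the dictionary instead of DataFrames; the phases <i>addLimit</i> and <i>takeLimited</i> and the keys they use are hypothetical.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object PhaseSketch {
  type Datadictionary = Map[String, Any]
  type TransformationPhase = Datadictionary => Datadictionary

  // Puts a configuration value into the dictionary.
  val addLimit: TransformationPhase = dictionary => dictionary + ("limit" -> 2L)

  // Reads two entries and stores the result under a new key.
  val takeLimited: TransformationPhase = dictionary => {
    val limit = dictionary("limit").asInstanceOf[Long]
    val numbers = dictionary("numbers").asInstanceOf[List[Int]]
    dictionary + ("limited" -> numbers.take(limit.toInt))
  }

  // Function.chain applies same-typed functions from left to right.
  val pipeline: TransformationPhase = Function.chain(Seq(addLimit, takeLimited))

  val result: Datadictionary = pipeline(Map("numbers" -> List(1, 2, 3)))

  def main(args: Array[String]): Unit = println(result("limited"))
}
</pre></div>
<p>
The price of <i>Map[String,Any]</i> is visible here as well: every read needs a cast, and a wrong key or type fails only at runtime.
</p>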
<p>
And finally let's try to create a connection between the construct just mentioned and pure functions which operate on "transformation primitives" (DataFrame, Long etc.):
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> liftToTransformation<span style="color: #333333">(</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">DataFrame</span><span style="color: #333333">)(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span> <span style="color: #333333">{</span> dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> frame <span style="color: #008800; font-weight: bold">=</span>dictionary<span style="color: #333333">(</span>key<span style="color: #333333">).</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">]</span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>f<span style="color: #333333">(</span>frame<span style="color: #333333">)</span>
<span style="color: #888888">//how to create a new dictionary???</span>
<span style="color: #333333">???</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
The first problem is that we don't know how to add the produced result to the dictionary. We have two quick solutions, and both of them produce a result of the following type:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">PhaseResult</span><span style="color: #333333">=(</span><span style="color: #BB0066; font-weight: bold">String</span><span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">DataFrame</span><span style="color: #333333">)</span>
</pre></div>
</p>
<p>
The first one is to make primitive functions aware of the dictionary mapping:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> primitiveFunctionAware<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #333333">={</span>df <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>df<span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"important_column"</span><span style="color: #333333">)</span>
<span style="color: #333333">(</span><span style="background-color: #fff0f0">"KEY"</span><span style="color: #333333">,</span>result<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
But if for some reason we want more flexibility, we can just use currying to inject the key name:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> primitiveFunctionCurried<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">String</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #008800; font-weight: bold">=</span>key <span style="color: #008800; font-weight: bold">=></span> <span style="color: #333333">{</span>df <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>df<span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"important_column"</span><span style="color: #333333">)</span>
<span style="color: #333333">(</span>key<span style="color: #333333">,</span>result<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
And now we can just create a new dictionary by adding the primitive result to the old one:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> liftToTransformation<span style="color: #333333">(</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #333333">)(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span> <span style="color: #333333">{</span> dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> frame <span style="color: #008800; font-weight: bold">=</span>dictionary<span style="color: #333333">(</span>key<span style="color: #333333">).</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">]</span>
dictionary <span style="color: #333333">+</span> f<span style="color: #333333">(</span>frame<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
We are almost there :) One last thing in this phase: we want to use dictionary elements of any type, not only DataFrames. To solve this, let's create a mechanism responsible for extracting elements of particular types. In its basic form it can be something like this:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">trait</span> <span style="color: #BB0066; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">]{</span>
<span style="color: #008800; font-weight: bold">def</span> extract<span style="color: #333333">(</span>dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">)(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">A</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">implicit</span> <span style="color: #008800; font-weight: bold">object</span> <span style="color: #BB0066; font-weight: bold">DataFrameExtractor</span> <span style="color: #008800; font-weight: bold">extends</span> <span style="color: #BB0066; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">]</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> extract<span style="color: #333333">(</span>dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">)(</span>key<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">DataFrame</span> <span style="color: #333333">=</span>
dictionary<span style="color: #333333">(</span>key<span style="color: #333333">).</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">]</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">implicit</span> <span style="color: #008800; font-weight: bold">object</span> <span style="color: #BB0066; font-weight: bold">LongExtractor</span> <span style="color: #008800; font-weight: bold">extends</span> <span style="color: #BB0066; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Long</span><span style="color: #333333">]</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">override</span> <span style="color: #008800; font-weight: bold">def</span> extract<span style="color: #333333">(</span>dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">)(</span>key<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Long</span> <span style="color: #333333">=</span>
dictionary<span style="color: #333333">(</span>key<span style="color: #333333">).</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Long</span><span style="color: #333333">]</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
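<p>
To see how the compiler picks the right extractor, here is a small self-contained sketch of the same type class without Spark. It assumes a hypothetical <i>StringExtractor</i> in place of the DataFrame one and a helper <i>read</i> that is not part of the design above.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object ExtractorSketch {
  type Datadictionary = Map[String, Any]

  trait Extractor[A] {
    def extract(dictionary: Datadictionary)(key: String): A
  }

  implicit object LongExtractor extends Extractor[Long] {
    override def extract(dictionary: Datadictionary)(key: String): Long =
      dictionary(key).asInstanceOf[Long]
  }

  // Hypothetical extractor used here instead of the DataFrame one.
  implicit object StringExtractor extends Extractor[String] {
    override def extract(dictionary: Datadictionary)(key: String): String =
      dictionary(key).asInstanceOf[String]
  }

  // The context bound "A: Extractor" asks the compiler for an implicit Extractor[A].
  def read[A: Extractor](dictionary: Datadictionary, key: String): A =
    implicitly[Extractor[A]].extract(dictionary)(key)

  val config: Datadictionary = Map("businessKey" -> 1L, "name" -> "cars")
  val businessKey: Long = read[Long](config, "businessKey")
  val name: String = read[String](config, "name")

  def main(args: Array[String]): Unit = println(s"$businessKey $name")
}
</pre></div>
<p>
The requested type alone decides which implicit object is used. Asking for a type without an extractor does not compile, while a missing key or a value of the wrong type still fails at runtime because of the cast.
</p>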
<p>
And now we can easily prepare a set of lifting methods which can lift our primitive functions into the transformation signature <b>Datadictionary => Datadictionary</b>:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">def</span> liftToTransformation<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A:Extractor</span><span style="color: #333333">](</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #333333">)(</span>key1<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span> <span style="color: #333333">{</span> dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> param1 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key1<span style="color: #333333">)</span>
dictionary <span style="color: #333333">+</span> f<span style="color: #333333">(</span>param1<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">def</span> liftToTransformation<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A:Extractor</span>,<span style="color: #333399; font-weight: bold">B:Extractor</span><span style="color: #333333">](</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">,</span><span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=></span><span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #333333">)(</span>key1<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>key2<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span> <span style="color: #333333">{</span> dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> param1 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key1<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> param2 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key2<span style="color: #333333">)</span>
dictionary <span style="color: #333333">+</span> f<span style="color: #333333">(</span>param1<span style="color: #333333">,</span>param2<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">def</span> liftToTransformation<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A:Extractor</span>,<span style="color: #333399; font-weight: bold">B:Extractor</span>,<span style="color: #333399; font-weight: bold">C:Extractor</span><span style="color: #333333">](</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333333">(</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">,</span><span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">,</span>C<span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=></span><span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #333333">)(</span>key1<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>key2<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">,</span>key3<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span> <span style="color: #333333">{</span> dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> param1 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key1<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> param2 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">B</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key2<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> param3 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">C</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key3<span style="color: #333333">)</span>
dictionary <span style="color: #333333">+</span> f<span style="color: #333333">(</span>param1<span style="color: #333333">,</span>param2<span style="color: #333333">,</span>param3<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
The question is: can this be implemented as a single method that handles functions of various arities, such as Function1, Function2 or Function3? I don't have an answer to this question yet, but I have seen similar per-arity definitions in Scalaz and in Spark itself (with UDFs), so maybe there is no other way.
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1g00hhXFtvtm9bFhT9FLMW7VUM-06sotvAVucjAVjYcVGcA-HbN9nVYawmhagYv9XVhJgCimUnxWK-Nq4jUTU9JGkeCgiiRzc3R4dMls8jIU4YNE7mToW1m7uJQCT9MZaeHmCqqqKVmkV/s1600/lifting.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh1g00hhXFtvtm9bFhT9FLMW7VUM-06sotvAVucjAVjYcVGcA-HbN9nVYawmhagYv9XVhJgCimUnxWK-Nq4jUTU9JGkeCgiiRzc3R4dMls8jIU4YNE7mToW1m7uJQCT9MZaeHmCqqqKVmkV/s640/lifting.png" /></a></div>
<p>
One last snippet with example usage and we are done with this section of the article:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> primitiveFunctionCurried<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">String</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">DataFrame</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #008800; font-weight: bold">=</span>key <span style="color: #008800; font-weight: bold">=></span> <span style="color: #333333">{</span>df <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>df<span style="color: #333333">.</span>select<span style="color: #333333">(</span><span style="background-color: #fff0f0">"important_column"</span><span style="color: #333333">)</span>
<span style="color: #333333">(</span>key<span style="color: #333333">,</span>result<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">val</span> functionLiftedToTransformation<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span><span style="color: #333333">=</span>
liftToTransformation<span style="color: #333333">(</span>primitiveFunctionCurried<span style="color: #333333">(</span><span style="background-color: #fff0f0">"RESULT_KEY"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"INPUT_FRAME"</span><span style="color: #333333">)</span>
</pre></div>
</p>
<h1>Transformation Chain</h1>
<p class="akapit">
We have already covered the level of primitive functions, with all their advantages for composition, and we know how to lift them to the transformation signature. Now we need to build a "transformation chain" which will execute our transformations step by step:
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMMrolZvtcqGDBlDOuCth3MPDdoZ5g8UjBSENzyXPpDKTDCSTeJm8228G91OWF03aqhb-MbIU-JMvCdBeHlVf4FxEH5MKHYney_fa2aocIjZUa_OIlrGfyaNsH3kZGS4t3-CVIZSanRCzY/s1600/transformationchain.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMMrolZvtcqGDBlDOuCth3MPDdoZ5g8UjBSENzyXPpDKTDCSTeJm8228G91OWF03aqhb-MbIU-JMvCdBeHlVf4FxEH5MKHYney_fa2aocIjZUa_OIlrGfyaNsH3kZGS4t3-CVIZSanRCzY/s640/transformationchain.png" /></a></div>
</p>
<p>
We can construct a simple wrapper which acts a little bit like "andThen" for our transformations:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">object</span> <span style="color: #BB0066; font-weight: bold">Transformation</span><span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">def</span> init<span style="color: #333333">(</span>loader<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span> <span style="color: #333333">=></span> <span style="color: #BB0066; font-weight: bold">Any</span><span style="color: #333333">)(</span>keys<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String*</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Transformation</span> <span style="color: #333333">=</span> <span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">val</span> initialDictionary<span style="color: #008800; font-weight: bold">=</span>keys<span style="color: #333333">.</span>map<span style="color: #333333">(</span>key <span style="color: #008800; font-weight: bold">=></span> <span style="color: #333333">(</span>key<span style="color: #333333">,</span>loader<span style="color: #333333">(</span>key<span style="color: #333333">))).</span>toMap
<span style="color: #008800; font-weight: bold">new</span> <span style="color: #BB0066; font-weight: bold">Transformation</span><span style="color: #333333">(</span>initialDictionary<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">Transformation</span><span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">){</span>
<span style="color: #008800; font-weight: bold">def</span> transform<span style="color: #333333">(</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">Datadictionary</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #008800; font-weight: bold">new</span> <span style="color: #BB0066; font-weight: bold">Transformation</span><span style="color: #333333">(</span>f<span style="color: #333333">(</span>dictionary<span style="color: #333333">))</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
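<p>
To see the wrapper in action without a Spark cluster, here is a minimal, self-contained sketch. It assumes that <i>Datadictionary</i> is a plain <i>Map[String,Any]</i> and uses Int values instead of DataFrames; the loader and both phases are hypothetical and exist only for illustration.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object TransformationChainSketch {
  // Assumption: the dictionary is a plain immutable Map
  type Datadictionary = Map[String, Any]

  class Transformation(val dictionary: Datadictionary) {
    // every step returns a new wrapper, so steps can be chained
    def transform(f: Datadictionary => Datadictionary): Transformation =
      new Transformation(f(dictionary))
  }

  object Transformation {
    def init(loader: String => Any)(keys: String*): Transformation =
      new Transformation(keys.map(key => (key, loader(key))).toMap)
  }

  // Hypothetical loader: "loads" the length of the key instead of a DataFrame
  val loader: String => Any = key => key.length

  // Hypothetical phases working on Int values
  val doubleFrame1: Datadictionary => Datadictionary =
    dict => dict + ("DOUBLED" -> (dict("FRAME1").asInstanceOf[Int] * 2))
  val sumWithFrame2: Datadictionary => Datadictionary =
    dict => dict + ("SUM" -> (dict("DOUBLED").asInstanceOf[Int] + dict("FRAME2").asInstanceOf[Int]))

  val result: Transformation = Transformation
    .init(loader)("FRAME1", "FRAME2")
    .transform(doubleFrame1)
    .transform(sumWithFrame2)
}
</pre></div>
<p>
Each phase only needs its input keys to be present in the dictionary, which is the only coupling between the steps.
</p>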
<p>
And finally we can use the first version of our transformation chain. Some parts are still missing in this form, but we have more or less solved our initial problem: we have chained a set of standalone functions. The only dependency lives on the transformation level, where a particular lifted function expects a given frame to be in the dictionary, but how that frame got there is not important.
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> domainLoader<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">Any</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">???</span>
<span style="color: #008800; font-weight: bold">val</span> domainTransformation1<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span>
liftToTransformation<span style="color: #333333">(</span>primitiveFunctionCurried<span style="color: #333333">(</span><span style="background-color: #fff0f0">"RESULT_KEY1"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"FRAME1"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> domainTransformation2<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span>
liftToTransformation<span style="color: #333333">(</span>primitiveFunctionCurried<span style="color: #333333">(</span><span style="background-color: #fff0f0">"RESULT_KEY2"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"FRAME2"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> domainTransformation3<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">TransformationPhase</span> <span style="color: #333333">=</span>
liftToTransformation<span style="color: #333333">(</span>primitiveTwoParamFunction<span style="color: #333333">(</span><span style="background-color: #fff0f0">"FINAL_RESULT"</span><span style="color: #333333">))(</span><span style="background-color: #fff0f0">"RESULT_KEY1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"RESULT_KEY2"</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span><span style="color: #BB0066; font-weight: bold">Transformation</span>
<span style="color: #333333">.</span>init<span style="color: #333333">(</span>domainLoader<span style="color: #333333">)(</span><span style="background-color: #fff0f0">"FRAME1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"FRAME2"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"CONFIG"</span><span style="color: #333333">)</span>
<span style="color: #333333">.</span>transform<span style="color: #333333">(</span>domainTransformation1<span style="color: #333333">)</span>
<span style="color: #333333">.</span>transform<span style="color: #333333">(</span>domainTransformation2<span style="color: #333333">)</span>
<span style="color: #333333">.</span>transform<span style="color: #333333">(</span>domainTransformation3<span style="color: #333333">)</span>
</pre></div>
</p>
<h2>Improvements - logging</h2>
<p>
With the current design it is actually very easy to log the parameters passed to each phase and the result it calculated. It is simple because we only need to modify the lift functions and make use of the logs in the transformation wrapper:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">Log</span> <span style="color: #333333">=</span> <span style="color: #BB0066; font-weight: bold">String</span>
<span style="color: #008800; font-weight: bold">type</span> <span style="color: #333399; font-weight: bold">LoggableTransformationPhase</span><span style="color: #333333">=</span><span style="color: #BB0066; font-weight: bold">Datadictionary</span> <span style="color: #008800; font-weight: bold">=></span> <span style="color: #333333">(</span><span style="color: #BB0066; font-weight: bold">Datadictionary</span><span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Log</span><span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">def</span> liftToTransformationWithLogging<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A:Extractor</span><span style="color: #333333">](</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">PhaseResult</span><span style="color: #333333">)(</span>key1<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">:</span>
<span style="color: #333399; font-weight: bold">LoggableTransformationPhase</span><span style="color: #333333">=</span> <span style="color: #333333">{</span> dictionary <span style="color: #008800; font-weight: bold">=></span>
<span style="color: #008800; font-weight: bold">val</span> param1 <span style="color: #008800; font-weight: bold">=</span>implicitly<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Extractor</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">A</span><span style="color: #333333">]].</span>extract<span style="color: #333333">(</span>dictionary<span style="color: #333333">)(</span>key1<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span>f<span style="color: #333333">(</span>param1<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> log<span style="color: #008800; font-weight: bold">=</span>s<span style="background-color: #fff0f0">"transformed $param1 for $key1 into $result"</span>
<span style="color: #333333">(</span>dictionary <span style="color: #333333">+</span> result<span style="color: #333333">,</span>log<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">LoggableTransformation</span><span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">,</span>
<span style="color: #008800; font-weight: bold">val</span> log<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Seq</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]</span><span style="color: #008800; font-weight: bold">=</span><span style="color: #BB0066; font-weight: bold">Seq</span><span style="color: #333333">.</span>empty<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]){</span>
<span style="color: #008800; font-weight: bold">def</span> transform<span style="color: #333333">(</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">=>(</span><span style="color: #BB0066; font-weight: bold">Datadictionary</span><span style="color: #333333">,</span><span style="color: #BB0066; font-weight: bold">Log</span><span style="color: #333333">),</span>transformationTitle<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span>
<span style="color: #333333">{</span>
<span style="color: #008800; font-weight: bold">val</span> <span style="color: #333333">(</span>nextDictionary<span style="color: #333333">,</span>transformationLog<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span>f<span style="color: #333333">(</span>dictionary<span style="color: #333333">)</span>
<span style="color: #008800; font-weight: bold">val</span> newLog<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Seq</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">Log</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span> log <span style="color: #333333">:+</span> transformationTitle <span style="color: #333333">:+</span> transformationLog
<span style="color: #008800; font-weight: bold">new</span> <span style="color: #BB0066; font-weight: bold">LoggableTransformation</span><span style="color: #333333">(</span>nextDictionary<span style="color: #333333">,</span>newLog<span style="color: #333333">)</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
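<p>
A usage sketch may make the accumulated log easier to picture. The version below is self-contained: it assumes <i>Datadictionary</i> is a <i>Map[String,Any]</i>, and the <i>increment</i> phase and its keys are hypothetical stand-ins for a lifted DataFrame function.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object LoggingChainSketch {
  type Datadictionary = Map[String, Any] // assumption: plain immutable Map
  type Log = String

  class LoggableTransformation(val dictionary: Datadictionary,
                               val log: Seq[Log] = Seq.empty[Log]) {
    def transform(f: Datadictionary => (Datadictionary, Log),
                  transformationTitle: String): LoggableTransformation = {
      val (nextDictionary, transformationLog) = f(dictionary)
      // the title and the phase log are appended to what was logged so far
      new LoggableTransformation(nextDictionary, log :+ transformationTitle :+ transformationLog)
    }
  }

  // Hypothetical phase: reads INPUT, stores INPUT + 1 under OUTPUT
  val increment: Datadictionary => (Datadictionary, Log) = { dict =>
    val input = dict("INPUT").asInstanceOf[Int]
    val output = input + 1
    (dict + ("OUTPUT" -> output), s"transformed $input into $output")
  }

  val result: LoggableTransformation =
    new LoggableTransformation(Map("INPUT" -> 1))
      .transform(increment, "phase: increment")
}
</pre></div>
<p>
After the chain has run, <i>result.log</i> holds the title followed by the phase log, so it can be printed or stored in one place.
</p>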
<h2>Improvements - caching</h2>
<p>
We can also introduce a caching operation on the level of our abstraction. This moves the responsibility for caching from our functions to the transformation chain, which simplifies the functions themselves even more. A naive implementation may look like this:
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">Transformation</span><span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">){</span>
<span style="color: #008800; font-weight: bold">def</span> transform<span style="color: #333333">(</span>f<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Datadictionary</span><span style="color: #333333">=></span><span style="color: #BB0066; font-weight: bold">Datadictionary</span><span style="color: #333333">)</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #008800; font-weight: bold">new</span> <span style="color: #BB0066; font-weight: bold">Transformation</span><span style="color: #333333">(</span>f<span style="color: #333333">(</span>dictionary<span style="color: #333333">))</span>
<span style="color: #008800; font-weight: bold">def</span> cache<span style="color: #333333">(</span>keys<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String*</span><span style="color: #333333">)={</span>
keys<span style="color: #333333">.</span>foreach<span style="color: #333333">(</span>key <span style="color: #008800; font-weight: bold">=></span> dictionary<span style="color: #333333">(</span>key<span style="color: #333333">).</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">].</span>cache<span style="color: #333333">())</span>
<span style="color: #008800; font-weight: bold">this</span>
<span style="color: #333333">}</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
<p>
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">val</span> result<span style="color: #008800; font-weight: bold">=</span><span style="color: #BB0066; font-weight: bold">Transformation</span>
<span style="color: #333333">.</span>init<span style="color: #333333">(</span>domainLoader<span style="color: #333333">)(</span><span style="background-color: #fff0f0">"FRAME1"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"FRAME2"</span><span style="color: #333333">,</span><span style="background-color: #fff0f0">"CONFIG"</span><span style="color: #333333">)</span>
<span style="color: #333333">.</span>transform<span style="color: #333333">(</span>domainTransformation1<span style="color: #333333">)</span>
<span style="color: #333333">.</span>cache<span style="color: #333333">(</span><span style="background-color: #fff0f0">"RESULT_KEY1"</span><span style="color: #333333">)</span>
<span style="color: #333333">.</span>transform<span style="color: #333333">(</span>domainTransformation2<span style="color: #333333">)</span>
<span style="color: #333333">.</span>transform<span style="color: #333333">(</span>domainTransformation3<span style="color: #333333">)</span>
</pre></div>
</p>
<p>
We can even implement some kind of statistics which tell us whether a frame was requested several times, in which case it may be worth caching it!
<!-- HTML generated using hilite.me --><div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%"><span style="color: #008800; font-weight: bold">class</span> <span style="color: #BB0066; font-weight: bold">DataDictionaryWithStatistics</span><span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">val</span> dictionary<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">Map</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span>,<span style="color: #333399; font-weight: bold">Any</span><span style="color: #333333">]){</span>
<span style="color: #008800; font-weight: bold">private</span> <span style="color: #008800; font-weight: bold">var</span> statistics<span style="color: #008800; font-weight: bold">:</span> <span style="color: #333399; font-weight: bold">Map</span><span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">String</span>,<span style="color: #333399; font-weight: bold">Int</span><span style="color: #333333">]</span> <span style="color: #008800; font-weight: bold">=</span> <span style="color: #333333">???</span> <span style="color: #888888">//how many times each dataframe was requested</span>
<span style="color: #008800; font-weight: bold">def</span> asDataFrame<span style="color: #333333">(</span>key<span style="color: #008800; font-weight: bold">:</span><span style="color: #333399; font-weight: bold">String</span><span style="color: #333333">)</span><span style="color: #008800; font-weight: bold">=</span>dictionary
<span style="color: #333333">.</span>get<span style="color: #333333">(</span>key<span style="color: #333333">)</span>
<span style="color: #333333">.</span>map<span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">_</span><span style="color: #333333">.</span>asInstanceOf<span style="color: #333333">[</span><span style="color: #333399; font-weight: bold">DataFrame</span><span style="color: #333333">])</span>
<span style="color: #333333">.</span>getOrElse<span style="color: #333333">(</span><span style="color: #008800; font-weight: bold">throw</span> <span style="color: #008800; font-weight: bold">new</span> <span style="color: #BB0066; font-weight: bold">RuntimeException</span><span style="color: #333333">(</span>s<span style="background-color: #fff0f0">"$key is not a Dataframe"</span><span style="color: #333333">))</span>
<span style="color: #333333">}</span>
</pre></div>
</p>
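<p>
One possible way to fill in the counting part is sketched below. It is only an illustration: the class name, the <i>get</i> method and the <i>cacheCandidates</i> helper are hypothetical, and plain values are used instead of DataFrames so the snippet runs without Spark.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object StatisticsSketch {
  class DictionaryWithStatistics(val dictionary: Map[String, Any]) {
    // how many times each key was requested; starts empty
    private var statistics: Map[String, Int] = Map.empty[String, Int]

    // returns the value and records the request
    def get(key: String): Any = {
      val value = dictionary.getOrElse(key,
        throw new RuntimeException(s"$key is missing in the dictionary"))
      statistics = statistics.updated(key, statistics.getOrElse(key, 0) + 1)
      value
    }

    // keys requested at least `threshold` times are candidates for caching
    def cacheCandidates(threshold: Int): Set[String] =
      statistics.filter { case (_, count) => count >= threshold }.keySet
  }

  val dictionary = new DictionaryWithStatistics(Map("FRAME1" -> 1, "FRAME2" -> 2))
  dictionary.get("FRAME1")
  dictionary.get("FRAME1")
  dictionary.get("FRAME2")

  val candidates: Set[String] = dictionary.cacheCandidates(threshold = 2)
}
</pre></div>
<p>
The mutable counter is hidden inside the class here; a purely functional variant would have to return the updated statistics together with the value.
</p>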
<h1>Reinventing the wheel?</h1>
<p class="akapit">
In this article we chose a certain direction in solving our problem. We started from a problem definition, then we built the solution step by step, and finally we saw what kind of tool we need to implement it. In my opinion this is better than taking "something with a name" and adapting the solution to it. For example, compare "how can I solve this with the Writer monad?" with "can I use the Writer monad to solve this problem?"
</p>
<p>
Now we can check whether there is something ready to use which can help us build a DataFrame transformation solution. We are not going to dive deeply into those tools because that is a good topic for future posts.
</p>
<h2>State Monad</h2>
<p>
Here is a good link with an explanation: <a href="http://www.slideshare.net/dgalichet/scalaio-statemonad">http://www.slideshare.net/dgalichet/scalaio-statemonad</a>. Conceptually, a State Monad is something with the signature <b>S=>(S,A)</b>, so in our example <i>State would be DataDictionary</i> (or a mix of DataDictionary and Seq[Log]) and <i>A could be a Log</i> (or nothing if we don't want logging: kamikaze mode!)
</p>
<p>
In this article we never considered a situation where we need to compose two full transformations. The State monad is more general and was created with composability in mind, so there is great potential for such operations.
</p>
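<p>
To make the <b>S=>(S,A)</b> signature a bit more concrete, here is a hand-written, minimal State sketch. It is not the Scalaz or Cats implementation, only an illustration; the <i>phase</i> helper and the keys are hypothetical, and the dictionary is again assumed to be a <i>Map[String,Any]</i>.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object StateSketch {
  // Minimal State: a function S => (S, A) plus map/flatMap for sequencing
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] = State { s =>
      val (next, a) = run(s)
      (next, f(a))
    }
    def flatMap[B](f: A => State[S, B]): State[S, B] = State { s =>
      val (next, a) = run(s)
      f(a).run(next) // the second step sees the state left by the first one
    }
  }

  type Datadictionary = Map[String, Any]
  type Log = String

  // Hypothetical phase: adds one entry to the dictionary and returns a log line
  def phase(key: String, value: Any): State[Datadictionary, Log] =
    State(dict => (dict + (key -> value), s"added $key"))

  // Two phases composed into one bigger transformation
  val program: State[Datadictionary, Seq[Log]] =
    phase("RESULT_KEY1", 1).flatMap { log1 =>
      phase("RESULT_KEY2", 2).map(log2 => Seq(log1, log2))
    }

  val (finalDictionary, logs) = program.run(Map.empty)
}
</pre></div>
<p>
Because <i>program</i> is itself a State value, it can be composed with other programs in exactly the same way as single phases.
</p>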
<h2>Writer Monad</h2>
<p>
<ul>
<li><a href="https://wiki.haskell.org/All_About_Monads#The_Writer_monad">https://wiki.haskell.org/All_About_Monads#The_Writer_monad</a> </li>
<li><a href="http://eed3si9n.com/learning-scalaz/Writer.html#Adding+logging+to+program">http://eed3si9n.com/learning-scalaz/Writer.html#Adding+logging+to+program</a> </li>
<li><a href="http://stackoverflow.com/questions/23942890/is-the-writer-monad-effectively-the-same-as-the-state-monad">http://stackoverflow.com/questions/23942890/is-the-writer-monad-effectively-the-same-as-the-state-monad</a> </li>
</ul>
I've never used this one in practice, but my first impression is that it was created to solve exactly our problem! I will certainly go deeper into this topic.
</p>
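<p>
As a rough illustration of the idea, a Writer can be seen as a value paired with a log which is appended on every step. The sketch below is hand-written and specialised to a <i>Seq[String]</i> log; it is not the Scalaz API, and <i>addKey</i> is a hypothetical phase.
</p>
<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;"><pre style="margin: 0; line-height: 125%">object WriterSketch {
  // Minimal Writer: a value together with the log accumulated so far
  final case class Writer[A](log: Seq[String], value: A) {
    def map[B](f: A => B): Writer[B] = Writer(log, f(value))
    def flatMap[B](f: A => Writer[B]): Writer[B] = {
      val next = f(value)
      Writer(log ++ next.log, next.value) // logs are concatenated automatically
    }
  }

  type Datadictionary = Map[String, Any] // assumption: plain immutable Map

  // Hypothetical phase: adds one entry and reports it in the log
  def addKey(key: String, value: Any): Datadictionary => Writer[Datadictionary] =
    dict => Writer(Seq(s"added $key"), dict + (key -> value))

  val result: Writer[Datadictionary] =
    Writer(Seq.empty[String], Map.empty[String, Any])
      .flatMap(addKey("RESULT_KEY1", 1))
      .flatMap(addKey("RESULT_KEY2", 2))
}
</pre></div>
<p>
Compared with the logging wrapper shown earlier, the phases here never touch the accumulated log; they only return their own entry.
</p>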
<h1>Summary</h1>
<p class="akapit">
Please remember that the code presented in this article is just a research project. Although it compiles, its main purpose was to present some concepts, and some additional work is needed to adapt it to real-world demands. If someone wants to take a look at the code, here is the repo: <a href="https://github.com/PawelWlodarski/blog">https://github.com/PawelWlodarski/blog</a>.
</p>
<p>
And this is the end of this article and an exciting summary!
</p>
<div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnG8pBkxlqzKYBh9_SIlDc4glNg_7C2jnrnVOVVvvYtzaXOFNvkE0NZOw-X0ww8bh97A9S_xsRBXi_J8q-7CqSDibXo5SCeVa9qwm3C1Hxz9oPWtZL79B85ARDbkv7_dozfq25cyWtORl4/s1600/mountainview.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhnG8pBkxlqzKYBh9_SIlDc4glNg_7C2jnrnVOVVvvYtzaXOFNvkE0NZOw-X0ww8bh97A9S_xsRBXi_J8q-7CqSDibXo5SCeVa9qwm3C1Hxz9oPWtZL79B85ARDbkv7_dozfq25cyWtORl4/s400/mountainview.jpg" /></a></div>