Haskell Adventures: Functors

Part 1 | Part 2

Software design becomes easier when it’s possible to replace parts of the program: for instance, user might want to load the same report in the CSV or XLS format. In this case we need to have a shared part that prepares the data, and the replaceable one for the formatting. In the object–oriented world we have classes with methods that can be polymorphic: they have the same API but work differently under the hood. Usually such classes might have a common base class/implement interface (e.g., C#, Java) or just have a method with the same signature (e.g., Ruby, Python). How can we do it in the Haskell where we do not have classes in the traditional sense?

Types are instances of classes. Wait, what?

Believe me or not, but in Haskell this problem is solved using classes 👹 However, Haskell classes have nothing common with OOP ones: class, or type class, defines a list of functions that should be implemented in the type that belongs to the class. As soon as we know that we work with a type that implements a class, we can start using all the functions defined in its class type.

You might notice that type classes are very similar to OOP interfaces

Time to get our hands dirty! Imagine a situation that you have a value that might be blank, let’s try to present it using types:

data MMaybe a = MJust a | MNothing deriving (Show)

MJust 1 -- MJust 1

Welcome the type constructor, that can embed the value of any other type (i.e., a) and gives it a context: it’s easy to understand if value is present, or not. As a result, our value goes to the some kind of box (this analogy is not fully correct but we can go with that for now).

Let’s try to compare two values:

MJust 1 == MJust 2

-- <interactive>:5:1: error:
--     • No instance for (Eq (MMaybe Integer)) arising from a use of ‘==’
--     • In the expression: MJust 1 == MJust 2
--       In an equation for ‘it’: it = MJust 1 == MJust 2

Boom! No instance for (Eq (MMaybe Integer)) means that our type should be an instance of a class Eq. No more words, I do want to compare my boxes!

instance (Eq m) => Eq (MMaybe m) where
  MJust x == MJust y = x == y
  MNothing == MNothing = True
  _ == _ = False

MJust 1 == MJust 2 -- True

There are three possible scenarios: MNothing is equal to MNothing, MNothing and MJust can never be equal, MJust is equal to MJust only when their values are equal. In order to compare values we do x == y, but how Haskell knows these two values can be compared? This is why we have (Eq m) =>: it tells compiler that (MMaybe m) can only implement Eq when m implements Eq. Let’s try MMaybe with something incomparable:

data Incomparable = Something | SomethingElse

MJust Something == MJust SomethingElse

-- <interactive>:8:1: error:
--     • No instance for (Eq (MMaybe Incomparable))
--         arising from a use of ‘==’
--     • In the expression: MJust Something == MJust SomethingElse
--       In an equation for ‘it’:
--           it = MJust Something == MJust SomethingElse

Failed as expected! By the way, did you notice little deriving Show at the beginning? Show is also class type that teaches Haskell how to print values, while deriving asks Haskell to use the default implementation.

– Mom, can we have the implementation of Show?
– We have the implementation at home!
At home: deriving

Using Maybe and understanding Functor

Why did I call my type MMaybe? Because the problem we solve is so common that Maybe is a part of a standard library! Let’s try it out: we’re going to write business logic, like we do in our daily work 🙂 There is a Jira card assigned to us: we need to read a user’s email from the database and format it before printing.

type UserID = Int
type Email = String

getEmail :: UserID -> Maybe Email
getEmail 42 = Just "john.doe@example.com"
getEmail _ = Nothing

formatEmail :: Maybe Email -> Maybe Email
formatEmail maybeEmail
  | Just email <- maybeEmail = Just $ "Email: " ++ email
  | otherwise = Nothing

formatEmail $ getEmail 42 -- Just "Email: john.doe@example.com"

First of all, we define two new types—UserId and Email to make our signatures more readable (otherwise our function signature was getEmail :: Int -> Maybe String). Technically they are just aliases to built–in types.

The getEmail function imitates database call. Since our startup is still small we have only one user. Moreover, previous 41 users deleted their data. 🤷‍♂️ Notice that we switched to the Maybe type from the standard library (it’s called Prelude). Our homemade implementation was kicked out of the house.

The formatEmail function accepts the Maybe Email type and performs the formatting: if it’s not Nothing it “unboxes” the value (Just email <- maybeEmail), adds the "Email: " prefix and packs it back to Maybe.

You might notice a new thing in the syntax: there is $ in the Just $ "Email: " ++ email, what is it for? It helps us to change or order of execution without using braces. If we remove $ from this expression, it will be executed as (Just "Email: ") ++ email, which will obviously fail. In opposite, $ will make sure that the right part will be executed earlier (i.e., as it was Just ("Email: " ++ email)).

The pattern when we take value out of the “box”, change it and put back sounds like something we might want to use often. Please welcome our new friend—type class Functor!:

data MMaybe a = MJust a | MNothing deriving (Show)

instance Functor MMaybe where
  fmap f (MJust x) = MJust (f x)
  fmap f MNothing = MNothing

type UserID = Int
type Email = String

getEmail :: UserID -> MMaybe Email
getEmail 42 = MJust "john.doe@example.com"
getEmail _ = MNothing

formatEmail :: MMaybe Email -> MMaybe Email
formatEmail = fmap (\email -> "Email: " ++ email)

formatEmail $ getEmail 42 -- MJust "Email: john.doe@example.com"

In order to prepare this example we had to bring back our own implementation of Maybe we kicked off earlier. The operation we look for is called fmap: it accepts a function and a “box”; if possible, it takes value from the box, applies the function and puts the value back to the box. This is how we can implement it for the MMaybe:

instance Functor MMaybe where
   fmap f (MJust x) = MJust (f x)
   fmap f MNothing = MNothing

Now we can change formatEmail to use fmap and get rid of the pattern matching:

formatEmail :: MMaybe Email -> MMaybe Email
formatEmail = fmap (\email -> "Email: " ++ email)

fmap is used very often, so there is even an infix form for it—<$> (please note that this snippet was switched back to the built–in Maybe which is already an instance of Functor):

type UserID = Int
type Email = String

getEmail :: UserID -> Maybe Email
getEmail 42 = Just "john.doe@example.com"
getEmail _ = Nothing

formatEmail :: Maybe Email -> Maybe Email
formatEmail maybeEmail = ("Email: " ++) <$> maybeEmail

formatEmail $ getEmail 42 -- Just "Email: john.doe@example.com"
formatEmail $ getEmail 4 -- Nothing

By the way, a list type (e.g., [1, 2, 3]) is an instance of Functor too. How do you think fmap is implemented for lists? Right, it’s just a map you’ve seen thousands of times before 🙂

Functors for functions

In the previous example we used fmap with the function that accepts only one argument: ++, which is used to concatenate lists, accepts two arguments, but we partially applied one argument using currying (("Email: " ++)) so it became a function of one argument.

If you forgot what currying is—check out the first part of the article

What if we pass a function that accepts two arguments (e.g., ++) to the fmap? As we know, fmap applies the function to the value in the box. We also know, that when we call a multi–parameter function with a single argument, it returns a new function without the first argument. Combine these two facts and you’ll get the right answer: fmap will take the value from the box, partially apply it to the function, and put that curried function back to the box! Congratulations, you just discovered applicative functors.

There is a special type class called Applicative that adds a support of applicative functors to types. In order to make things work, we need to implement two functions: pure and <*>.

data MMaybe a = MJust a | MNothing deriving (Show)

instance Functor MMaybe where
   fmap f (MJust x) = MJust (f x)
   fmap f MNothing = MNothing

instance Applicative MMaybe where
  pure = MJust
  MNothing <*> _ = MNothing
  (MJust f) <*> something = fmap f something

The pure function accepts a value and puts it to the box. For instance, in case of MMaybe it wraps the value with MJust and calls it a day; if it was a list—it would create a list with a single element.

The <*> is a little bit tricky: it has an applicative functor on the left and some value on the right. It tries to apply the function from the applicative functor to the value when it’s possible, otherwise, when there is the MNothing on the left—returns MNothing.

You might wonder why one might want to use it, so here is an example from our do much business application:

type UserID = Int
type Email = String

getEmail :: UserID -> MMaybe Email
getEmail 42 = MJust "john.doe@example.com"
getEmail _ = MNothing

formatEmail :: MMaybe Email -> MMaybe Email
formatEmail maybeEmail = (++) <$> maybeEmail <*> MJust "Email: "

formatEmail $ getEmail 42 -- MJust "Email: john.doe@example.com"
formatEmail $ getEmail 4 -- MNothing

Take a closer look at the updated formatEmail function:

formatEmail :: MMaybe Email -> MMaybe Email
formatEmail maybeEmail = (++) <$> maybeEmail <*> MJust "Email: "

First of all we build an applicative functor using (++) <$> maybeEmail, there are two scenarios possible:

if maybeEmail contains a value then ++ will be applied to that value (e.g., (++) <$> MJust "value" will become MJust ("value ++));
if maybeEmail is MNothing then it will just stay MNothing.

After that <*> is used to apply functor to the value. Again, there are two scenarios possible:

if there is an MNothing on the left then it will just stay MNothing;
if there is an applicative functor on the left then the function inside it will be applied to the value inside MJust (e.g. MJust ("value" ++) <*> MJust "!" becomes MJust ("value" ++ "!") and MJust ("value!")).

Why so complex? The main benefit here is that we only care about the “good” scenario when email exists, because when MNothing appears in the middle of the chain—it’s just traversed to the top of the calculation.

ISBN validation problem

If you read my previous article about Haskell you might notice that I’m bad at examples, so I steal them from Codewars. Things did not change, and here comes another one!

Imagine that you need to validate an ISBN number and here are the rules:

number has exactly 10 symbols;
every position except the last one can contain only digits;
last position can be either digit or “X” (which means 10);
given that positions start with 1, a sum of positions multiplied by the corresponding digit should be dividable by 11 without a reminder.

Let me illustrate the last part: 048665088X is valid because (0*1+4*2+8*3+6*4+6*5+5*6+0*7+8*8+8*9+10*10) / 11 equals to 32.

Right to the solution:

import Data.List
import Data.Char ( digitToInt, isDigit )
import Data.Maybe

validISBN10 :: String -> Bool
validISBN10 num
  | length num /= 10 = False
  | Nothing <- sumDigitsMod11 = False
  | Just value <- sumDigitsMod11 = value == 0
  where
    sumDigitsMod11 = (`mod` 11) . sum <$> sequence components
    components = [(*) <$> pure pos <*> getDigit char pos | (pos, char) <- zip [1..] num]

getDigit :: Char -> Int -> Maybe Int
getDigit 'X' 10 = pure 10
getDigit 'X' _ = Nothing
getDigit d _
  | isDigit d = pure $ digitToInt d
  | otherwise = Nothing

validISBN10 "048665088X" -- True
validISBN10 "048665188X" -- False

As before, let’s read from the bottom to the top. A function getDigit accepts a character and its position and returns Maybe Int. Why Maybe? Well, it would be helpful to stop calculations when we see something invalid. First of all, we try to parse X on the 10 position:

getDigit 'X' 10 = pure 10

Then, we make sure that X does not appear somewhere else:

getDigit 'X' _ = Nothing

Finally, we try to parse a digit and return it, otherwise—return Nothing:

getDigit d _
  | isDigit d = pure $ digitToInt d
  | otherwise = Nothing

Before we get to that huge function that solves our problem, let’s come up with the plan: we need to take all the chars along with their positions, parse them, sum and make sure that the result can be divided by 11. Since we operate functors we can stop thinking about bad scenarios and just return Nothing.

Let’s start with generating pairs of positions and parsed digits:

components = [(*) <$> pure pos <*> getDigit char pos | (pos, char) <- zip [1..] num]

You might guess that [1..] is the endless list. zip takes two lists and generates a list of pairs, the length of the list will match the length of the shortest one. Given that num equals to "048665088X", zip [1..] num will produce [(1,'0'),(2,'4'),(3,'8'),(4,'6'),(5,'6'),(6,'5'),(7,'0'),(8,'8'),(9,'8'),(10,'X')].

We use a list generation syntax to get a list of positions multiplied by digits:

(*) <$> pure pos <*> getDigit char pos

(*) <$> pure pos takes a position pos (a first element of the pair), uses pure to put it to the Maybe (we could write Just pos, but we want be fancy) and makes it an applicative functor. For instance, if our pos equals 2, the result of (*) <$> pure pos will be Just (* 2). After that, we apply this functor to the result of the getDigit call, which can be either Just or Nothing. As a result, the whole expression will be either Just Int or Nothing, and components will contain the array with multiplication results.

Huh, next line:

sumDigitsMod11 = (`mod` 11) . sum <$> sequence components

Let’s start with sequence: this function takes a list of “boxes”, takes all values, puts them into the list and wraps the list with the box again. As usual, a single Nothing element makes the whole thing Nothing:

sequence [Just 1, Just 2, Just 3] -- Just [1,2,3]
sequence [Just 1, Nothing, Just 3] -- Nothing

(`mod` 11) . sum is a composition of two functions: the result function will accept some list, sum values in it and call mod on it.

By the way, . is a function too! It has an interesting type (b -> c) -> (a -> b) -> a -> c, which means that when it gets two functions it combines them into a new one. Also, since . is a function, we can use a prefix form: (.) (`mod` 11) sum [1,2,3]. Not sure why you might need it, just wanted to let you know 🙂

Finally, we use <$> to perform the operation on our Maybe value on the right. When it’s Just—we sum numbers and get the reminder. Now we only need to check our conditions:

validISBN10 :: String -> Bool
validISBN10 num
  | length num /= 10 = False
  | Nothing <- sumDigitsMod11 = False
  | Just value <- sumDigitsMod11 = value == 0
  where
    sumDigitsMod11 = (`mod` 11) . sum <$> sequence components
    components = [(*) <$> pure pos <*> getDigit char pos | (pos, char) <- zip [1..] num]

We need to check three things:

length num /= 10 = False checks the expected length of the input;
Nothing <- sumDigitsMod11 = False handles the case when something wrong happened during parsing (i.e., Nothing appeared somewhere);
Just value <- sumDigitsMod11 = value == 0 checks that reminder is zero.

🎉 Problem is solved!

That’s all for today! Let’s recap what we learned:

Haskell has type classes to define list of functions and, optionally, default implementations for them;
types can be instances of classes, which means that have all these functions implemented;
types can be polymorphic using type classses;
types can be “boxes” for the values they hold, box gives value a “context”;
Functor is a class that can change value in the box;
Applicative Functors represent functions inside boxes and their usage along with regular values.

Reminding that I have no production experience in Haskell, so things I write might be inaccurate or even completely wrong. Bear with me 🐻