Parse, don’t validate in Kotlin


This article was inspired by Alexis King’s awesome “Parse, don’t validate”. All credit goes to her, and I strongly recommend reading it even if you are unfamiliar with Haskell.
Here I will try to illustrate the problem more concisely and show how we can apply those concepts in Kotlin.

What’s the context?

Almost any useful application needs input from an external source. It doesn’t matter what that source is: it could be data coming from an HTTP request, a gRPC call, or simpler input provided through a terminal.

Given that we do not have control over the quality of the input, we should never trust it. We should always perform the due validation and sanitization to prevent our programs from crashing abruptly or, worse, letting an attacker exploit them.

Input validation is a fundamental part of any piece of software, yet I have the feeling that it is often an afterthought.

What’s the problem with validation?

Even though the following problem applies to every program, I will focus on Backend Web Services, given that it is the area I’m most familiar with.

In the JVM world it is pretty common to first deserialize a raw request into something more useful for our program. This part is usually delegated to powerful libraries like kotlinx.serialization.
Once we have a better representation of our Request, the next step is to validate that it actually satisfies all the constraints imposed by our business logic.

Let me use an example to better clarify this concept.
We want to implement a customer sign-up endpoint and the request will be transferred in JSON format:

{
  "email": string,
  "password": string,
  "passwordVerify": string
}

Our web-server will then use kotlinx.serialization and deserialize the above request into a proper Kotlin data class:

import kotlinx.serialization.Serializable

@Serializable
data class SignupRequest(
    val email: String,
    val password: String,
    val passwordVerify: String
)

The deserializer will ensure that all properties are present and have the right type. Our next step is to validate SignupRequest according to our business logic:

  • email must be a non-empty string
  • email must be a valid e-mail
  • password must be strong enough
  • passwordVerify must be the same as password

Validation is where things get interesting. There are many ways of doing it, and most often in the JVM world, when an argument doesn’t satisfy a constraint, an IllegalArgumentException is thrown and the request is aborted.

In my opinion, throwing an exception because validation failed is quite wrong. The whole point of validation is to find out whether the data satisfies all our requirements, so data that doesn’t satisfy them isn’t an exceptional case. Precise information about the failure also matters, because we want to return a meaningful error to our Frontend so that our customers can understand why the request failed.

Our response should look like:

{
  "errors": {
    "email": "invalid-email",
    "password": "too-weak",
    "passwordVerify": "no-match"
  }
}

If we use exceptions, validation short-circuits at the first error, and it can get quite ugly if we then want to return all errors.
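To make the short-circuiting concrete, here is a minimal sketch. The function name validateOrThrow and its deliberately naive checks are assumptions made up for illustration; they do not come from any library, and SignupRequest is redeclared without serialization so the snippet stands alone.

```kotlin
// Hypothetical plain version of the request type (serialization omitted).
data class SignupRequest(
    val email: String,
    val password: String,
    val passwordVerify: String
)

// Exception-based validation: `require` throws IllegalArgumentException
// as soon as one check fails, so the remaining checks never run.
fun validateOrThrow(req: SignupRequest) {
    require("@" in req.email) { "invalid-email" }               // naive e-mail check
    require(req.password.length >= 8) { "too-weak" }            // naive strength rule
    require(req.passwordVerify == req.password) { "no-match" }
}

fun main() {
    // All three fields are invalid, yet only the first failure is reported.
    val bad = SignupRequest(email = "nope", password = "123", passwordVerify = "456")
    val failure = runCatching { validateOrThrow(bad) }.exceptionOrNull()
    println(failure?.message) // prints "invalid-email"
}
```

Collecting the other two failures with this style would mean wrapping every single check in its own try/catch, which is the ugliness mentioned above.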

For these reasons, validation should at least return a list of errors and each error should capture at least the reason and the field:

data class ValidationError(val field: String, val reason: String)

Kotlin already has nice libraries that will allow you to validate your classes in a type-safe manner, for example Konform.

Suppose now that we implemented a function to validate our SignupRequest:

fun validate(req: SignupRequest): List<ValidationError>
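As a rough illustration, one possible hand-written body for that function could look like the sketch below. The concrete rules (an "@" check and a minimum length of 8) are placeholder assumptions rather than real business rules, and the types are redeclared so the snippet is self-contained.

```kotlin
data class SignupRequest(val email: String, val password: String, val passwordVerify: String)
data class ValidationError(val field: String, val reason: String)

// Runs every check and collects all failures instead of stopping at the first one.
fun validate(req: SignupRequest): List<ValidationError> = buildList<ValidationError> {
    // Placeholder rule: a real e-mail check would be stricter.
    if (req.email.isBlank() || "@" !in req.email) add(ValidationError("email", "invalid-email"))
    // Placeholder rule: "strong enough" is reduced to a minimum length here.
    if (req.password.length < 8) add(ValidationError("password", "too-weak"))
    if (req.passwordVerify != req.password) add(ValidationError("passwordVerify", "no-match"))
}

fun main() {
    val errors = validate(SignupRequest(email = "nope", password = "123", passwordVerify = "456"))
    errors.forEach { println("${it.field}: ${it.reason}") }
    // email: invalid-email
    // password: too-weak
    // passwordVerify: no-match
}
```

Note that an empty list means "valid", but nothing in the types records that the check ever happened.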

Problem solved! Let’s move on to the happy path, the actual creation of our customer entity.

The natural fit for it would be a CustomerService class that takes care of it. We could be tempted to pass the whole SignupRequest:

class CustomerService {
    fun signup(req: SignupRequest): Unit
}

This version has a couple of problems:

  • CustomerService is only interested in email and password, but this type signature forces callers to also provide passwordVerify. This will make implementing another signup flow (e.g. an import) cumbersome.
  • We could pass a SignupRequest that does not satisfy any of the previously mentioned requirements. We could then say that CustomerService should be the one performing the validation, but experience shows that this is often a very bad decision. For example, a Public API flow could require a different set of validations than the Frontend one.

We could argue that we will always validate before calling the signup method, but practice teaches us that our memory is fallible. Even with the best intentions we cannot keep much in our minds, so we will inevitably forget to call this or that method at some point, especially in large and complex systems.

I believe that knowing our own limitations is really important and helps us improve. Because we understood that we are not good at keeping track of things, we created computers and compilers that can do it for us, and much more efficiently.

This should make us realize that validation alone isn’t the right tool for the job: it doesn’t give us any real guarantee and doesn’t let the Kotlin compiler help us.

Parsing to the rescue

What Alexis suggests in her article is that we should keep validation as close to the source as possible, and that all downstream components must then require proof that the data is actually correct.

What this means is that we should change our CustomerService to look like:

class CustomerService {
    fun signup(email: Email, password: Password): Unit
}

Now it explicitly requires an Email and a Password, but what’s better is that the Kotlin compiler will force us to pass the right type.

This also makes it evident that we can’t just pass values coming from our SignupRequest, given that those are simple, unstructured Strings. Hence we are forced to parse the request:

data class Signup(val email: Email, val password: Password)

fun parseSignup(req: SignupRequest): Parsed<Signup>

For the same reason as for validation, we don’t want to use exceptions, so our parse function returns a Parsed<Signup>. This follows the so-called Result pattern, also known as Either. It is much simpler than it sounds: the result is one of two possible objects:

  • An Ok<Signup> in case the request is valid
  • A ParseError in case some errors were detected while parsing

To use the value wrapped by Parsed we are forced by the compiler to acknowledge the failure case, so it will help us do the right thing ✌️
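One way such a type could be modelled in Kotlin is a sealed interface with two implementations. This is only a sketch under my own assumptions: the shape of Parsed, Ok and ParseError is inferred from how they are used in this article and is not necessarily how Parsix defines them, and parsePositiveInt is a hypothetical parser invented purely to have something small to run.

```kotlin
data class ValidationError(val field: String, val reason: String)

// Either a successfully parsed value or the reasons why parsing failed.
sealed interface Parsed<out T>
data class Ok<T>(val value: T) : Parsed<T>
data class ParseError(val errors: List<ValidationError>) : Parsed<Nothing>

// Hypothetical parser: refines a raw String into a strictly positive Int.
fun parsePositiveInt(field: String, raw: String): Parsed<Int> {
    val n = raw.toIntOrNull()
    return if (n != null && n > 0) Ok(n)
    else ParseError(listOf(ValidationError(field, "not-a-positive-int")))
}

// `when` over a sealed type must cover every case when used as an expression,
// so removing either branch is a compile-time error.
fun describe(parsed: Parsed<Int>): String =
    when (parsed) {
        is Ok -> "parsed ${parsed.value}"
        is ParseError -> "failed: ${parsed.errors.map { it.reason }}"
    }

fun main() {
    println(describe(parsePositiveInt("quantity", "42")))  // prints "parsed 42"
    println(describe(parsePositiveInt("quantity", "abc"))) // prints "failed: [not-a-positive-int]"
}
```

Because Parsed is sealed, the compiler knows these are the only two cases, and the wrapped value is reachable only inside the Ok branch.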

Our Endpoint will then look like:

class SignupEndpoint(val customerService: CustomerService) {
    fun invoke(req: SignupRequest): Response =
        when (val parsed = parseSignup(req)) {
            is ParseError ->
                invalidRequest(parsed.errors)
            is Ok -> {
                // Only reachable with values that were successfully parsed
                customerService.signup(
                    parsed.value.email,
                    parsed.value.password
                )
                successfulResponse()
            }
        }
}

By using the correct visibility modifiers, the compiler can also guarantee that our Endpoint must use parseSignup if it wants to get Email and Password objects out of the SignupRequest, which implies that the values provided to the signup method will always be correct.
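A minimal sketch of what "the correct visibility modifiers" could mean in practice: private constructors, with a parse function as the only way in. Everything here is an assumption for illustration (the naive checks, the companion parse functions, and the Parsed types, which are redeclared so the snippet stands alone); it is not taken from Parsix.

```kotlin
data class ValidationError(val field: String, val reason: String)

sealed interface Parsed<out T>
data class Ok<T>(val value: T) : Parsed<T>
data class ParseError(val errors: List<ValidationError>) : Parsed<Nothing>

// Private constructors: `parse` is the only way to obtain an instance.
// Plain classes are used because a data class's generated `copy`
// has historically been public even when the constructor is private.
class Email private constructor(val value: String) {
    companion object {
        fun parse(raw: String): Parsed<Email> =
            if (raw.isNotBlank() && "@" in raw) Ok(Email(raw)) // naive check
            else ParseError(listOf(ValidationError("email", "invalid-email")))
    }
}

class Password private constructor(val value: String) {
    companion object {
        fun parse(raw: String): Parsed<Password> =
            if (raw.length >= 8) Ok(Password(raw)) // naive strength rule
            else ParseError(listOf(ValidationError("password", "too-weak")))
    }
}

data class SignupRequest(val email: String, val password: String, val passwordVerify: String)
data class Signup(val email: Email, val password: Password)

// Parses every field and accumulates all errors before deciding the outcome.
fun parseSignup(req: SignupRequest): Parsed<Signup> {
    val email = Email.parse(req.email)
    val password = Password.parse(req.password)
    val errors = buildList<ValidationError> {
        if (email is ParseError) addAll(email.errors)
        if (password is ParseError) addAll(password.errors)
        if (req.passwordVerify != req.password) add(ValidationError("passwordVerify", "no-match"))
    }
    return if (errors.isEmpty() && email is Ok && password is Ok) {
        Ok(Signup(email.value, password.value))
    } else {
        ParseError(errors)
    }
}

fun main() {
    // val e = Email("nope") // would not compile: the constructor is private
    val good = SignupRequest("jane@example.com", "correct horse", "correct horse")
    val bad = SignupRequest("nope", "123", "456")
    for (req in listOf(good, bad)) {
        when (val parsed = parseSignup(req)) {
            is Ok -> println("welcome ${parsed.value.email.value}")
            is ParseError -> println(parsed.errors.map { it.field })
        }
    }
    // welcome jane@example.com
    // [email, password, passwordVerify]
}
```

With this shape, holding an Email or a Password is itself the proof that the checks ran, so a signup method that takes them does not need to validate again.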

Conclusion

Validation is very important for all of our programs, but just validating is not enough: we must refine our types so that the compiler can keep track of all the business requirements and help us do the right thing.

Even though the ecosystem is great when it comes to parsing byte streams (which is just another term for “deserialization”), it lacks a proper library for general-purpose parsing, i.e. parsing a less structured object into a more structured one.

That’s why I’ve decided to give back to the Kotlin community by open-sourcing a library that crystallizes the essence of “Parse, don’t validate”: Parsix

Hope you enjoyed it,
See you next time! 👋