Roland’s notes: Robert C. Martin – Clean Code
Like many others, I learn best by taking notes, and I tend to remember taking the notes more than the actual learning (this is why sending out meeting notes after an important meeting is such a game changer!).
In this series, let’s call it Roland’s notes for now, I will share my notes on books and articles that I find important enough to take notes on. They are meant mostly for me, and the purpose is not really to have them at the end: the goal is the journey of writing them. What could be a better starter than one of the must-haves of software engineering?
General impressions
It’s easy to see why this book is considered a must-read in the industry. One of my mentors once said: “Look, I could tell you this, but someone already put it in a perfect form and I don’t think I could explain it better. Just go read it”, which I found accurate afterwards.
It is the type of book I think I will have to come back to once I gain more coding experience, as some concepts were too abstract for me on the first read. However, it helped me understand what clean code means, and it can act as an anchor the next time I find myself debating what would be a better way of building a method.
Notes
Meaningful names
Naming cheat-sheet
I have a mark here that this is important. My first thought was: my god, I should just memorize this whole chapter. It is collected so well, what would I make notes of? However, I realized that I would really like to have a “Clean Code naming cheat sheet”, so I figured I could write one.
- Use intention-revealing names
  - a thing should do what its name suggests it does
- Make it pronounceable
  - avoid names like genymdhms (generation date, year, month, day, hour, minute, second)
  - just use English words, like generationTimestamp
- Use searchable names
  - MAX_CLASSES_PER_STUDENT is better than “7”
  - avoid one-char variable names
    - if your method is very small, it could be justified, but try to avoid it
- Don’t use prefixes or suffixes
  - people learn to ignore them. Once ignored, they just make the code harder to read
  - if you feel that you NEED to use prefixes, it is typically a strong indicator that you could extract that part into a separate class
  - if you have an interface and an implementation, and you feel you have to mark one of them, mark the implementation, not the interface
    - interface: ShapeFactory, implementation: ShapeFactoryImp
- Classes are nouns
  - avoid the words: Manager, Processor, Data, Info
- Method names
  - verb or verb phrase names
  - accessors, mutators and predicates should be prefixed with get, set and is
- Don’t be cute
  - use kill() for kill and not whack(). Say what you mean!
- One word per concept
  - if you already use fetch, use fetch everywhere. Consistency is key!
- Use computer-science terms
  - these will be known to the other coders you work with. It also helps with keeping the abstraction level higher
  - otherwise, you need business knowledge to understand the code, which is not preferred
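A few of the rules above, applied to a small made-up class. This is only a sketch: Student, its field and its methods are hypothetical names chosen for illustration, and the limit of 7 simply echoes the example in the list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, used only to illustrate the naming rules.
class Student {
    // Searchable and intention-revealing, unlike a bare 7 in the middle of the code.
    static final int MAX_CLASSES_PER_STUDENT = 7;

    // A plain English noun, no type or scope prefix.
    private final List<String> enrolledClasses;

    Student(List<String> enrolledClasses) {
        this.enrolledClasses = new ArrayList<>(enrolledClasses);
    }

    // Accessor: starts with "get".
    int getEnrolledClassCount() {
        return enrolledClasses.size();
    }

    // Predicate: starts with "is".
    boolean isFullyEnrolled() {
        return enrolledClasses.size() >= MAX_CLASSES_PER_STUDENT;
    }
}
```

Searching the code base for MAX_CLASSES_PER_STUDENT finds every place the limit matters, which searching for 7 would not.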
Functions
Factories
This was not covered directly in the book, only referred to, and I didn’t have a clear picture in my head of what a factory is and what its benefit is. The book’s example implements an EmployeeFactory.
Now I understand that a factory is a design pattern that provides an interface for creating objects while hiding which concrete class gets instantiated. (In the classic Factory Method form, a superclass declares the creation method and its subclasses decide the type of object to create.) It is easier to understand with an example:
We have two classes, Dog and Cat:
    public class Cat implements Animal {
        public void speak() {
            System.out.println("Meow");
        }
    }

    public class Dog implements Animal {
        public void speak() {
            System.out.println("Woof");
        }
    }
Now, when I want to create either of them, I need to name the exact class I want and satisfy its constructor (for simplicity, the constructors take no parameters here). To create an abstraction layer, we define a common interface, Animal:
    public interface Animal {
        void speak();
    }
Now I know that they are related and that I can interact with both of them in the same way, which gives me the option to create a factory:
    public class AnimalFactory {
        public Animal createAnimal(String type) {
            switch (type) {
                case "cat":
                    return new Cat();
                case "dog":
                    return new Dog();
                default:
                    throw new IllegalArgumentException("Invalid animal type: " + type);
            }
        }
    }
The benefit is that I don’t need to know the exact class or what it needs in order to be constructed: I only deal with AnimalFactory, which creates the requested animal and returns it (hence the name factory). The other benefit is that if I need a new type of animal, I only have to update AnimalFactory; the client code, which depends only on the Animal abstraction, does not change. A nice showcase of the open-closed principle from the client’s point of view!
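To see what this looks like from the client side, here is a self-contained sketch. It repeats the types from above (package-private, so everything fits in one file) and adds a hypothetical PetShop client that is not part of the original example.

```java
interface Animal {
    void speak();
}

class Cat implements Animal {
    public void speak() {
        System.out.println("Meow");
    }
}

class Dog implements Animal {
    public void speak() {
        System.out.println("Woof");
    }
}

class AnimalFactory {
    Animal createAnimal(String type) {
        switch (type) {
            case "cat":
                return new Cat();
            case "dog":
                return new Dog();
            default:
                throw new IllegalArgumentException("Invalid animal type: " + type);
        }
    }
}

// Hypothetical client: it never mentions Cat or Dog, only Animal and the factory.
class PetShop {
    public static void main(String[] args) {
        AnimalFactory factory = new AnimalFactory();
        for (String type : new String[] {"cat", "dog"}) {
            Animal animal = factory.createAnimal(type);
            animal.speak(); // prints "Meow", then "Woof"
        }
    }
}
```

Adding a Bird would mean one new class and one new case in AnimalFactory, while PetShop stays as it is.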
Objects and Data Structures
Output arguments
When I wrote my first application, I had a case where I had to do sequential filtering with mutual exclusivity (whatever I filtered out I know will not be needed later and can be removed). I figured the most efficient way of running the sequence was that whatever I found, I also removed it from the list. I remember when the senior developer reviewed my code, he said, “Don’t you touch my inputs!” and I took a mental note: never ever change any input parameter.
Now, life is not that black and white, and I think removing the items wasn’t a bad option from a performance point of view, but there wasn’t enough explanation around it. My conclusion is that if I have to break such a general rule, I should make the intent clearer in my code next time (maybe move the list into a class and have a method that removes the object from it?).
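The idea in the parentheses could look roughly like this. It is a minimal sketch, not what I actually wrote back then: CandidatePool and takeMatching are hypothetical names, and the class copies the caller’s list so that the original input is never touched.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical class: it owns the list, so removing items is an explicit,
// named operation on its own state instead of a hidden change to an argument.
class CandidatePool<T> {
    private final List<T> remaining;

    CandidatePool(List<T> candidates) {
        // Defensive copy: the caller's list stays untouched.
        this.remaining = new ArrayList<>(candidates);
    }

    // Removes every remaining item that matches the filter and returns them.
    List<T> takeMatching(Predicate<T> filter) {
        List<T> taken = new ArrayList<>();
        Iterator<T> iterator = remaining.iterator();
        while (iterator.hasNext()) {
            T item = iterator.next();
            if (filter.test(item)) {
                taken.add(item);
                iterator.remove();
            }
        }
        return taken;
    }

    // Returns a copy of what is still left in the pool.
    List<T> getRemaining() {
        return new ArrayList<>(remaining);
    }
}
```

Each filter in the sequence only sees what the previous ones left behind, so the performance benefit stays, and the name takeMatching tells the reader that the pool shrinks.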
Error handling
Error.java
When I built my first pet project in Spring, I had a paid mentor who shared some practices he would recommend. One of them was about error handling: he recommended creating an enum that contains all the possible error codes. Clean Code recommends against this (the argument is that such an enum is a dependency magnet: every class that uses it has to be recompiled and redeployed whenever it changes).
There is no conclusion. I still like the idea of having the enum as it forces the users to have more standardized error codes. Maybe having functional layering and error codes enum for that element could be a good middle ground? For this, I need to get more experience to make up my mind.
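For comparison, the alternative the book points to is using exceptions instead of a shared error-code enum: a new error becomes a new exception class, and existing classes do not have to change. A minimal sketch with made-up names (OrderException, OrderNotFoundException and OrderService are all hypothetical, and the lookup is hard-coded to keep the example short):

```java
// Base exception for one functional area. Hypothetical name.
class OrderException extends RuntimeException {
    OrderException(String message) {
        super(message);
    }
}

// A new kind of error is a new subclass; code that already catches
// OrderException keeps working without being touched.
class OrderNotFoundException extends OrderException {
    OrderNotFoundException(String orderId) {
        super("Order not found: " + orderId);
    }
}

class OrderService {
    String findOrder(String orderId) {
        // Hard-coded lookup, only to keep the sketch self-contained.
        if (!"42".equals(orderId)) {
            throw new OrderNotFoundException(orderId);
        }
        return "order-42";
    }
}
```

This does not give the standardized codes I like about the enum, so it is an illustration of the trade-off rather than a verdict.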
Classes
How do you write clean code?
You don’t! At least not on the first try. First, write code that works. The key is to not stop there: rethink and refactor until you don’t think it could be improved further. The whole thesis of the book is that this time investment will pay off thousands of times during maintenance.
Don’t return null
One unchecked null can cause a NullPointerException at runtime, so it is better not to return null at all. Practical advice: when dealing with lists, return Collections.emptyList() instead. Sometimes you will have to return null, but avoid it if you can.
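A small sketch of that advice. EmployeeDirectory and its map are hypothetical; Collections.emptyList() is the real JDK call.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical class used to illustrate returning an empty list instead of null.
class EmployeeDirectory {
    private final Map<String, List<String>> employeesByTeam;

    EmployeeDirectory(Map<String, List<String>> employeesByTeam) {
        this.employeesByTeam = employeesByTeam;
    }

    // Never returns null: an unknown team simply has no employees.
    List<String> getEmployees(String team) {
        List<String> employees = employeesByTeam.get(team);
        if (employees == null) {
            return Collections.emptyList();
        }
        return employees;
    }
}
```

The caller can loop over getEmployees("unknown-team") directly, with no null check and no risk of a NullPointerException.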
Learning Tests Are Better Than Free
Let’s imagine you have to learn how a new library or third-party API works. The recommended way is to start by writing test cases against it: you have to understand it anyway, and these tests stay useful later because you can run them every time the library/API is upgraded to make sure it still works as expected.
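As an illustration, here is what a learning test could look like. To stay dependency-free, this sketch uses the JDK’s own String.split as a stand-in for the third-party API and plain checks instead of a test framework; in a real project these would more likely be JUnit tests. The class and method names are my own.

```java
import java.util.Arrays;

// Learning tests: each method pins down one behavior I had to look up.
class StringSplitLearningTest {

    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError(description);
        }
    }

    // split(regex) silently drops trailing empty strings.
    static void trailingEmptyStringsAreDropped() {
        String[] parts = "a,b,,".split(",");
        check(Arrays.equals(parts, new String[] {"a", "b"}),
                "split(\",\") should drop trailing empty strings");
    }

    // A negative limit keeps them.
    static void negativeLimitKeepsTrailingEmptyStrings() {
        String[] parts = "a,b,,".split(",", -1);
        check(Arrays.equals(parts, new String[] {"a", "b", "", ""}),
                "split(\",\", -1) should keep trailing empty strings");
    }

    // Run this again after every upgrade of the library under test.
    static void runAll() {
        trailingEmptyStringsAreDropped();
        negativeLimitKeepsTrailingEmptyStrings();
    }
}
```

If a future version changed this behavior, runAll() would fail and point at the exact assumption that no longer holds.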
How big should a class be?
The book recommends the Single Responsibility Principle, which says that every class should have one responsibility and one reason to change. The responsibility part is easy to understand and even easier to ignore, but what do we mean by change?
To my inexperienced mind, it was not obvious what we mean by change. Is it a change in state? Or is it a change of a variable inside? Is it a change in code when a function changes? It turns out that it is the last one: we have to think about which functionality changes or future events could force us to change the class. If there is more than one, it is a good indicator that the class should be split.
An easy example would be the following class:
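    public class Report {
        public String generateData() {
            // Code to generate report data
            return "report data"; // placeholder so the example compiles
        }

        public String formatData() {
            // Code to format report data
            return "formatted report data"; // placeholder so the example compiles
        }
    }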
Now, assume the code is small enough that storing both methods in one class seems to make sense: the responsibility is “everything to do with the reports”. That is not really a single responsibility (for one thing, it is far too broad): how the report is generated and how it is formatted are separate reasons to change, hence it would be recommended to split the class.
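The split could look like the sketch below. ReportGenerator and ReportFormatter are names I made up, and the method bodies are placeholders that only show where each reason to change ends up.

```java
// Changes only when the way the data is produced changes.
class ReportGenerator {
    String generateData() {
        return "sales=100"; // placeholder data
    }
}

// Changes only when the presentation changes.
class ReportFormatter {
    String formatData(String data) {
        return "<report>" + data + "</report>"; // placeholder format
    }
}
```

Usage would then be `new ReportFormatter().formatData(new ReportGenerator().generateData())`, and switching to another output format touches only ReportFormatter.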
Emergence
The main four rules
- Runs all the tests
  - If you can’t verify correct behavior, it doesn’t matter if it behaves correctly.
  - Writing tests pushes you to keep smaller, more readable, more testable classes and methods.
- Contains no duplication
  - Easy to spot: same lines of code repeating
  - Hard to spot: different code doing the same thing
  - Use abstract classes (for example the Template Method pattern) to remove it
- Expresses the intent of the programmer
  - Keep functions, methods, classes small → each level of abstraction gives you an option to name something with an expressive name
  - Use standard
    - design patterns that other developers will understand
    - names and nomenclature
  - Unit tests are also expressive.
- Minimizes the number of classes and methods
  - The word minimizes is important. It should be the goal, but don’t overdo it.
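One way to read the “use abstract classes” bullet under duplication is the Template Method pattern: the shared algorithm lives once in an abstract class, and subclasses only supply the step that differs. A minimal sketch, where the class names and all the numbers are invented for illustration:

```java
// The shared algorithm is written once, in the abstract class.
abstract class VacationPolicy {

    // Template method: the same steps for every division.
    final int accrueVacationDays(int monthsWorked) {
        int baseDays = calculateBaseDays(monthsWorked);
        return Math.max(baseDays, legalMinimumDays());
    }

    private int calculateBaseDays(int monthsWorked) {
        return monthsWorked * 2; // invented rule: two days per month
    }

    // The only step that differs between subclasses.
    protected abstract int legalMinimumDays();
}

class UsVacationPolicy extends VacationPolicy {
    @Override
    protected int legalMinimumDays() {
        return 0; // invented value
    }
}

class EuVacationPolicy extends VacationPolicy {
    @Override
    protected int legalMinimumDays() {
        return 20; // invented value
    }
}
```

Without the abstract class, both policies would repeat the base calculation and differ in a single line, which is exactly the hard-to-spot kind of duplication.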
Concurrency
A common design pattern is producer-consumer. A producer produces a resource and puts it into a pool (a bounded buffer), from which the consumer picks it up. The producer must wait while there is no space in the pool, and the consumer must wait until there is something to consume.
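In Java, both waiting rules come for free with a BlockingQueue from java.util.concurrent. The sketch below is one possible way to wire it up, not the book’s listing: ProducerConsumerDemo and run are hypothetical names, while ArrayBlockingQueue, put and take are the real JDK API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class ProducerConsumerDemo {

    // Produces the numbers 0..itemCount-1 and returns them in the order consumed.
    static List<Integer> run(int itemCount, int poolCapacity) {
        BlockingQueue<Integer> pool = new ArrayBlockingQueue<>(poolCapacity);
        List<Integer> consumed = new ArrayList<>();

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    pool.put(i); // blocks while the pool is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    consumed.add(pool.take()); // blocks while the pool is empty
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the threads", e);
        }
        return consumed; // only read after both threads have finished
    }
}
```

With a single producer and a single consumer the queue keeps FIFO order, so `ProducerConsumerDemo.run(5, 2)` returns the items in the order they were produced even though the pool only ever holds two of them.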
Best practices & “Smells and Heuristics”
Vertical separation
Ideally, classes should have one public method, which uses private methods. Each private method should be defined just below its first usage. Code is read (by humans) mostly from top to bottom, so you should make your code readable that way.
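A small sketch of that layout. InvoicePrinter and its methods are hypothetical; the point is only the ordering, where each private method sits right below the method that first calls it.

```java
import java.util.Locale;

class InvoicePrinter {

    // The single public entry point comes first.
    public String print(String customer, double amount) {
        return header(customer) + "\n" + body(amount);
    }

    // First used in print(), so defined right below it.
    private String header(String customer) {
        return "Invoice for " + customer;
    }

    private String body(double amount) {
        return "Total: " + formatAmount(amount);
    }

    // First used in body(), so defined right below it.
    private String formatAmount(double amount) {
        return String.format(Locale.ROOT, "%.2f", amount);
    }
}
```

Reading the class from top to bottom goes from the big picture to the details, without jumping around the file.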