Read Code

If you woke up one day resolved to be a great writer, you’d hear two simple pieces of feedback: write a lot, and read a lot.

In software, nearly everyone falls short on the latter. Early in your career, embrace reading code. Read widely and read often.

Why read code?

Great writers are a function of the writers they’ve read. Similarly, seeing diverse coding practices lets you expand your own palette when it comes time to write your own code. It exposes you to new language functionality, different coding styles, and important dependencies.

Reading your dependencies will make you a more productive programmer. You’ll know the full functionality your dependencies offer. You’ll know exactly how they work and what tradeoffs they’re making. You’ll know where to debug when things go wrong.

Being comfortable reading code reduces the likelihood that you’ll have a “not invented here” mindset. The more you read, the more comfortable you’ll feel depending on or augmenting someone else’s code, rather than building your own.

A regular reading habit will expose you to tools and products very different from your day-to-day work.

As a web engineer, reading a small part of a raytracer’s codebase will expose you to a wholly different set of constraints and user needs.

The more you read, the less scary it’ll be. Developing a strategy to read code then becomes a virtuous cycle.

How to “read” code?

Approaching a codebase like you would a book, reading from start to end, is a recipe for failure. Any medium-sized codebase is impossible to read linearly. Instead, “reading” code requires a substantial amount of active work.

I use this four-part strategy for approaching any complex codebase (RSDW for short):

  1) Run: Compile, run, and understand what the code is supposed to do

  2) Structure: Learn high-level structure and review key integration tests

  3) Deep Dives: Follow key flows and read important data structures

  4) Write: Write tests and prioritize simple features and bug fixes

First, when you’re starting to read, be humble. A common mistake for early engineers is to judge their own code as “good” and that of others as “bad.” Instead, have some empathy for the styles of others and the uneven process that leads to the birth of most new software systems.

As a new developer, when you notice gaps, especially missing documentation and poor test coverage, help out. This is an exceptional way to learn, and it will endear you to the others who develop the codebase, whose support will speed up your journey to learn it.

Second, reading new code is exhausting. You are intricately retracing code flows and trying to hold dozens of new data structures and functions in your head concurrently. Be aggressive about taking breaks when you’re approaching a new codebase. When I’m starting on a new complex codebase, a few good hours of reading is all I need to feel productive for the rest of the day.

With the right frame of mind, it’s much easier to go through the active process of digesting new code.

1) Run

The first step in reading code isn’t to read code. It is to use the software.

Do not read code without understanding exactly what the software does and what functionality it offers. By the end of this stage, you should be able to summarize the code and understand what its inputs and outputs could be.

Using the software forces you to get it to run. This means compiling the code (in some languages) and pulling down the dependencies. This is also the time to run the tests and review the output messages. If you have trouble getting the system running, this is the perfect time to document what it actually takes to compile and run the software.

2) Structure

Next, identify the most critical parts of the code. This is the part that is most different from reading a book. Instead of beginning at the beginning, you identify the key nexuses in the code.

Start by understanding the structure of the code. At a minimum, use tree | less and cloc to figure out the languages and files of the codebase.
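
If cloc is not installed, a short script can give a similar first impression. This is only a rough sketch: it counts files per extension rather than lines of code, it skips hidden directories such as .git, and the function name count_files_by_extension is made up for illustration.

```python
import os
from collections import Counter


def count_files_by_extension(root):
    """Walk `root` and count files per extension, skipping hidden directories."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories (for example .git) in place so that
        # os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            # Files such as "Makefile" or ".gitignore" have no extension.
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[ext] += 1
    return counts
```

Calling count_files_by_extension(".").most_common(10) from the repository root lists the ten most common file types, which is usually enough to see which languages dominate.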

To identify the most important files, look at the most modified files (git log --pretty=format: --name-only | sort | uniq -c | sort -rg | head -10) and use other advanced tools. Review the most important integration tests, listing out the functions that are called. Flag these tests for later.

There’s a cheat code for this process too: find someone who’s worked on the code before. Understanding the structure is a good first task for a whiteboarding session.
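
The same "most modified files" count can be sketched in Python, which makes it easier to filter or post-process the result. The two function names below are hypothetical. The second one assumes git is on your PATH and that repo_dir points at a git repository.

```python
import subprocess
from collections import Counter


def most_modified_files(log_output, top=10):
    """Count how often each path appears in `git log --name-only` output."""
    paths = (line.strip() for line in log_output.splitlines())
    # Blank lines separate commits, so they are skipped.
    counts = Counter(path for path in paths if path)
    return counts.most_common(top)


def git_name_only_log(repo_dir="."):
    """Return the paths touched by every commit, one per line."""
    result = subprocess.run(
        ["git", "log", "--pretty=format:", "--name-only"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Running most_modified_files(git_name_only_log()) inside a repository returns (path, count) pairs, most frequently changed first. Keeping the counting separate from the git call means the counting can be tried on any saved log text.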

3) Deep Dives

Once you have a lay of the land, dig in.

Programming languages revolve around two fundamental paradigms: functional (where actions are primary) and object-oriented (where objects are primary). Similarly, when reading code, you should look at code flows (seeing the actions that are being performed) and look at data structures (where the results of actions are stored).

Pick 3-5 critical flows you’ve found from key integration tests or your review of the source files. Then dive deeper. Start at the top of a specific action and trace the code through. Some developers swear by debuggers that let you step through. Others prefer building UML diagrams or flamegraphs. If you decide to manually follow the code, make sure your editor is set up to let you use “go to definition” and “find all references” for quick navigation.
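
For a Python codebase, a lightweight alternative to stepping through a debugger is to record which functions a flow actually calls. The sketch below uses the standard library's sys.setprofile. The helper record_calls and the three-function flow (handle_request, parse, validate) are hypothetical stand-ins for whatever flow you are tracing.

```python
import sys


def record_calls(func, *args, **kwargs):
    """Run func and return (result, names of Python functions called, in order)."""
    calls = []

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls only;
        # built-ins implemented in C fire "c_call" and are ignored here.
        if event == "call":
            calls.append(frame.f_code.co_name)

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always restore whatever profiler was active before.
        sys.setprofile(previous)
    return result, calls


# Hypothetical flow standing in for real code under study.
def parse(raw):
    return raw.split(",")


def validate(fields):
    return len(fields) > 0


def handle_request(raw):
    return validate(parse(raw))
```

Here record_calls(handle_request, "a,b") returns the flow's result together with the list ["handle_request", "parse", "validate"]. On a real codebase the list is much longer, but it is a quick way to see which files to open next.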

For data structures, review the data types and when key variables are being set. Use the debugger to interrogate these data structures at critical moments.

I also keep two markdown docs open for these deep dives. The first is a “level up my coding” doc where I list out new syntax I’m seeing and code patterns I find interesting for my own learning (others call this a glossary). This allows me to return for further investigation. The second is a doc that lists out key questions I have for the developers of the codebase. At this stage, I also add to the documentation when I notice gaps.

Deep dives are especially powerful when paired with someone who knows the code. If I have limited time with a developer on the project, I always have them walk me through a few key flows.

4) Write Code

Unlike passive reading in literature, a critical part of “reading” code is writing code. Two easy ways to “read” are to write tests and to tackle simple features and bugs.

Writing tests is an active form of reading, forcing you to pay attention to the actual inputs and outputs of a particular interaction.

Writing tests is when you realize that you’re still missing important details. Writing tests imprints the code in your memory in a way that reading alone cannot. Unit tests are an easy way to start, and once I have some base mastery, I move over to integration tests that force me to understand increasingly larger parts of the codebase.
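
As an illustration, here is what a first round of unit tests might look like, using Python's built-in unittest module. The function normalize_username is a hypothetical stand-in for something you found while reading. In practice you would import the real function instead of defining it.

```python
import unittest


def normalize_username(raw):
    """Hypothetical stand-in for a function found while reading a codebase."""
    return raw.strip().lower().replace(" ", "_")


class NormalizeUsernameTest(unittest.TestCase):
    def test_trims_and_lowercases(self):
        self.assertEqual(normalize_username("  Ada "), "ada")

    def test_replaces_inner_spaces(self):
        self.assertEqual(normalize_username("Ada Lovelace"), "ada_lovelace")

    def test_empty_input(self):
        # Writing this case raises a question for the maintainers:
        # is an empty username meant to be allowed here?
        self.assertEqual(normalize_username(""), ""
        )
```

Each test pins down one input and one output. The empty-input case is the kind of detail that tends to surface only once you try to write the assertion down.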

The other easy way to write early code is to write simple features or address easy bugs. Neither task demands complete knowledge of the codebase, but both force you to confront the code. They also provide quick wins that increase your confidence and motivation.

What code to read?

Early in your career, 60% of your time should be spent reading code. Maybe half of that should be code outside of the direct codebases you actually build on top of. That’s an awfully large amount of time to fill, so what should you read?

The easiest way to get started reading and the highest ROI is to learn your dependencies. Internalizing how your dependencies work lets you more easily debug and reason across your entire system.

The other high ROI path is to pick an important system at your company that you interface with, and read through it. Not only will this be valuable in your work, but professional codebases are different from open source codebases. They are written closest to what your team’s engineers feel is the “right” way.

Beyond the direct systems you interact with, cultivate an openness to reading widely. Early in your career, I recommend putting aside an hour in the morning or evening to read through code outside your day-to-day work.

To start, pick a few easily understood codebases. Redis is known as a popular starting point in C. Famed codebases such as Vim are more complicated, with lots of nuance, but an easy way to start is to read a specific subsystem.

Try to actively read tools widely different from your day job. If you’re used to high level abstractions, learn an abstraction level (or three) down. If you work in one language, pick another language to read in your free time.

Find coders you respect or want to mimic and follow them on GitHub. Read a few of their other codebases. Stay up to date on their most recent work.

Create a reading group, a code club. In Stockholm, I heard great things about “The Classical Code Reading Group of Stockholm,” where they read classic codebases (I used to join at an affiliate run by Thoughtbot in NYC).

When you first start reading code, your focus is not simply to learn a codebase, but to develop a mindset that will pay long-term dividends.

Links

Articles on reading code

  Ask HN: How do you familiarize yourself with a new codebase

  Ask HN: How to understand the large codebase of an open source project?

  Strategies to quickly become productive in an unfamiliar codebase

  What's the best way to become familiar with a large codebase?

  Tips for Reading Code

  Software Engineering Radio: Software Archaeology with Dave Thomas (Podcast)

Books

  Code Reading: The Open Source Perspective (book)

  Working Effectively with Legacy Code (book)

Codebases to read

  Good Python codebases to read

  Good Go codebases to read



Source

This article is taken from a post on a forum, which can be found here:
https://news.ycombinator.com/item?id=19431874
https://hackernoon.com/one-secret-to-becoming-a-great-software-engineer-read-code-467e31f243b0