Do Programmers Need to Understand the Underlying Platform?

This topic came up for discussion recently. How much of the underlying technology (such as the network, server, operating system, etc.) does a programmer need to know in order to write good programs? A colleague held the belief that, ideally, the underlying technologies can or should be abstracted away so that programmers only need to know how to program. Another person believed it is necessary for programmers to understand the platform they develop for. Abstraction is useful and important, but I, too, believe that ultimately the programmer had jolly well better have a good understanding of the platform he or she is developing on.

The original question was about whether a programmer needs to understand the underlying hardware and operating system. But we soon generalized the issue to cover other underlying platform technologies, such as networks, databases, filesystems, and other layers below the application.

I think we can all agree that good knowledge of hardware and operating systems is important, because they influence many aspects of an application. For example, do you prefer multithreading to forking multiple processes? In Computer Science theory, the general idea is that threads are “light”, so multithreading is usually more efficient. But this does not hold equally on every operating system, and Linux is one example. Linux creates threads and processes through the same kernel mechanism, so threads there are not dramatically lighter than processes. This is why, even after the Apache httpd web server started supporting multithreading, the old pre-forking multiprocess model remained a sensible choice on Linux platforms.
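Rather than take either claim on faith, you can measure creation cost on the machine you actually deploy to. The following is a minimal sketch in Python, assuming a Unix-like system where os.fork is available. The helper names time_threads and time_forks are made up for this illustration, and the interpreter's own overhead is included in both numbers, so treat the result as a rough comparison only.

```python
import os
import threading
import time


def time_threads(count):
    """Create, start, and join `count` do-nothing threads; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        worker = threading.Thread(target=lambda: None)
        worker.start()
        worker.join()
    return time.perf_counter() - start


def time_forks(count):
    """Fork and reap `count` do-nothing child processes; return elapsed seconds.

    Assumption: os.fork exists (Unix-like systems only).
    """
    start = time.perf_counter()
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately without running cleanup handlers
        os.waitpid(pid, 0)  # parent: wait for the child to finish
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 200
    print(f"{n} threads:   {time_threads(n):.4f} s")
    print(f"{n} processes: {time_forks(n):.4f} s")
```

The numbers will differ between operating systems, kernel versions, and hardware, which is the point: the answer depends on the platform.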

Magic Layer of Abstraction

The colleague contended that, ideally, this should be abstracted away so that programmers don’t need to know it. My question, then: can an abstraction layer magically resolve a choice like multithreading versus multiple processes? Perhaps multithreading and multiprocessing are not terribly different, so let’s take this further. Some purpose-designed, very high-performance web servers are single-threaded, single-process, and event-driven with non-blocking I/O; lighttpd is one example. Can an abstraction layer transform program logic from one model to another, depending on the performance characteristics of the underlying operating system?
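To make the event-driven model concrete, here is a small sketch using Python's asyncio. It is not how lighttpd is implemented; it only illustrates one thread interleaving many waits. The names handle_request, serve_all, and run_demo are hypothetical, and asyncio.sleep stands in for a non-blocking network or disk wait.

```python
import asyncio
import time


async def handle_request(request_id: int, io_delay: float) -> int:
    # Simulated non-blocking I/O: the event loop runs other handlers while we wait.
    await asyncio.sleep(io_delay)
    return request_id


async def serve_all(count: int, io_delay: float) -> list:
    # All handlers share one thread; gather returns results in submission order.
    handlers = [handle_request(i, io_delay) for i in range(count)]
    return await asyncio.gather(*handlers)


def run_demo(count: int = 100, io_delay: float = 0.05):
    """Run `count` simulated requests and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(serve_all(count, io_delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_demo()
    print(f"handled {len(results)} requests in {elapsed:.3f} s on one thread")
```

Handled one after another, 100 requests that each wait 0.05 seconds would take about 5 seconds. Here the waits overlap, so the total stays close to a single wait. Notice that the program had to be written in this style from the start; no layer converted blocking code into it.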

Let’s look at JavaScript. Suppose such an abstraction layer could exist, and imagine it to be the JavaScript engine itself. Programmers write JavaScript and let the engine deal with the underlying characteristics of the operating system and hardware. Can the JavaScript engine parallelize your JavaScript code across multiple processor cores? The answer is no, at least for today. Could it happen some day in the future?

My take is that you have to rewrite your algorithm to take advantage of parallelism. You cannot make a JavaScript engine automagically do that for you.
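As an illustration of what “rewriting the algorithm” can mean, here is a sketch in Python rather than JavaScript. The function names are invented for this example. The sequential loop carries a running total from one iteration to the next, so nothing can split it up automatically. The chunked version is restructured by hand into independent pieces whose partial results are combined at the end.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def sum_squares_sequential(numbers):
    # Each iteration depends on the running total from the previous one.
    total = 0
    for n in numbers:
        total += n * n
    return total


def _sum_squares_chunk(chunk):
    # Independent unit of work: needs nothing from any other chunk.
    return sum(n * n for n in chunk)


def sum_squares_chunked(numbers, executor: Executor, chunk_count: int = 4):
    """Split the input into chunks, process them via `executor`, combine results."""
    numbers = list(numbers)
    size = max(1, -(-len(numbers) // chunk_count))  # ceiling division
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    return sum(executor.map(_sum_squares_chunk, chunks))


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(sum_squares_chunked(range(1, 11), pool))  # prints 385
```

Even then the platform matters. In standard CPython builds, threads do not run CPU-bound Python code on several cores at once, so real speedup for this kind of work would need a process-based executor instead. The restructuring makes parallelism possible; knowing the runtime tells you whether you will actually get it.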

Persistent Data Storage

Now, consider databases. Databases can be seen as a data storage abstraction. You can read data from them and write data to them. Can you simply treat the database as a magic black box and trust it to do what’s best for you, regardless of what you throw at it?

Absolutely not. Careful indexing helps improve read performance. But if you then wonder why we don’t just index everything: every extra index has to be updated on each write, so excessive indexing penalizes write performance. So, you need to understand how best to design your indices.
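You can watch an index change how a query is executed. The sketch below uses SQLite from Python's standard library as a stand-in for whatever database you use; the users table, its columns, and the index name idx_users_email are all hypothetical. It asks the database for its query plan before and after the index is created.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for `sql` as one human-readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the descriptive text.
    return " | ".join(row[-1] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
    conn.executemany(
        "INSERT INTO users (email, name) VALUES (?, ?)",
        [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
    )
    lookup = "SELECT name FROM users WHERE email = ?"
    arg = ("user42@example.com",)

    before = query_plan(conn, lookup, arg)  # no index yet: full table scan
    conn.execute("CREATE INDEX idx_users_email ON users (email)")
    after = query_plan(conn, lookup, arg)   # the lookup can now use the index
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("before:", before)
    print("after: ", after)
```

The exact wording of the plan varies between SQLite versions, but the first plan reports a scan of the whole table and the second names the index. The cost is that every later INSERT or UPDATE touching email must also maintain that index.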

Some people may say that index design is really part and parcel of using a database. But I’m sure there are programmers who just want a persistent data store, and who turn to databases without really wanting to know the nitty-gritty details of optimal database indexing.

Next, how about choosing the right storage engine? MySQL offers a variety of table storage types. Some types serve certain purposes better than others. Some types lock at the table level, others lock at the row level. You don’t really need to know this to use the database. But I think you would write a better program (or, in this case, design a better database schema) if you understood these underlying details.

Network Matters

Communicating over the network can also become complicated. A program that emits small bits of information and expects an acknowledgement after each transmission might perform well on a fast, low-latency network. But if you’re communicating over slower or longer-distance (i.e. high-latency) networks, you will hit a performance bottleneck.
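A back-of-the-envelope model shows why. The sketch below is a deliberate simplification with a made-up function name: it counts only round trips and ignores bandwidth, processing time, and packet loss. The latency figures in the example are illustrative, not measurements.

```python
import math


def transfer_time_seconds(message_count, round_trip_seconds, batch_size=1):
    """Estimate total time when each batch waits for an acknowledgement.

    Simplifying assumption: time is dominated by round trips alone.
    """
    round_trips = math.ceil(message_count / batch_size)
    return round_trips * round_trip_seconds


if __name__ == "__main__":
    messages = 10_000
    lan = transfer_time_seconds(messages, 0.0005)               # 0.5 ms round trip
    wan = transfer_time_seconds(messages, 0.2)                  # 200 ms round trip
    wan_batched = transfer_time_seconds(messages, 0.2, batch_size=100)
    print(f"LAN, one at a time:  {lan:.1f} s")
    print(f"WAN, one at a time:  {wan:.1f} s")
    print(f"WAN, batches of 100: {wan_batched:.1f} s")
```

Under these assumptions the same 10,000 messages take about 5 seconds on the fast network and about 2,000 seconds on the slow one. Batching 100 messages per acknowledgement brings the slow case down to about 20 seconds.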

The TCP layer in TCP/IP tries to automagically solve this for you with something called Nagle’s Algorithm. Sending every small write as its own segment wastes bandwidth and round trips. Nagle’s Algorithm holds back small writes while earlier data is still unacknowledged, and batches them up so that the whole batch can be sent in one go. Sounds really nice, since we avoid the ding-dong of acknowledgements between every small data packet.

But Nagle’s Algorithm can badly interfere with certain types of communication protocols. It doesn’t play nicely with another TCP feature called delayed acknowledgements. It can wreak havoc on real-time systems that expect data to be sent instantly. So, you see, Nagle’s Algorithm is both good and bad at the same time. It depends on what you’re trying to do.
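When an application needs small writes to go out immediately, the usual remedy is to turn Nagle’s Algorithm off for that connection with the TCP_NODELAY socket option. Here is a minimal sketch in Python; the two helper names are invented for this example, while the socket calls themselves are the standard library's.

```python
import socket


def make_low_latency_socket():
    """Create a TCP socket with Nagle's Algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 asks the TCP stack to send small writes without coalescing.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock):
    """Report whether TCP_NODELAY is currently set on `sock`."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    sock = make_low_latency_socket()
    print("Nagle disabled:", nagle_disabled(sock))
    sock.close()
```

Whether this is the right setting depends on the protocol. A bulk transfer usually benefits from leaving the default alone, while a chatty request-response or real-time protocol may not. The abstraction cannot make that call for you.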

If Nagle’s Algorithm is too low-level for your comfort, consider how the network can affect high-level application design too. For example, an app may make an AJAX call on a user action like a click. Suppose you design an interactive web game that can generate many rapid-fire clicks. Is it a good idea to fire off so many AJAX calls in quick succession? If network latency is high, your users are probably going to click much faster than the network can deliver the responses to previous calls. If you have not thought about this carefully, your web app might not behave deterministically when it has so many AJAX calls “in flight”, because responses can arrive late or out of order.
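One common way to keep such an app deterministic is to number each request and ignore any response older than one already applied. The sketch below shows only that bookkeeping, in Python rather than browser JavaScript. The class LatestResponseGuard and its methods are hypothetical names, not part of any library.

```python
class LatestResponseGuard:
    """Apply a response only if it is newer than every response applied so far."""

    def __init__(self):
        self._last_issued = 0
        self._last_applied = 0
        self.state = None

    def next_request_id(self):
        # Call this when sending a request; attach the id to the request.
        self._last_issued += 1
        return self._last_issued

    def apply_response(self, request_id, payload):
        # Call this when a response arrives, passing back the id it belongs to.
        if request_id <= self._last_applied:
            return False  # stale: a newer response has already been applied
        self._last_applied = request_id
        self.state = payload
        return True


if __name__ == "__main__":
    guard = LatestResponseGuard()
    first = guard.next_request_id()
    second = guard.next_request_id()
    # The network delivers the responses out of order.
    print(guard.apply_response(second, "score=20"))  # True
    print(guard.apply_response(first, "score=10"))   # False, ignored as stale
    print(guard.state)                               # score=20
```

This is only one option. Depending on the game, you might instead throttle clicks, queue requests so only one is in flight, or batch several actions into one call.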

Performance and Scalability

Much of what I’m talking about has to do with performance and scalability. You could write a working program without knowing any of the complexities above. Writing a working program is easy. Really easy. But you will run into performance and scalability bottlenecks if you do not have a deeper understanding of what you’re programming on or for.

Consider my earlier example of rapid-fire AJAX calls. The load on the web server may not be tremendous when you have a small handful of users, but think about what happens when you scale up to hundreds or thousands of simultaneous online users. Would your server hold up under several thousand requests per second?

Sidetrack: It often happens that programmers write programs that work perfectly on their development PC. The programs will probably also work great in QAT. But they fail outright once they go into production. It is only when apps have gone into production that many programmers begin to grasp the reality of traffic volume, disk bottlenecks, lock contention, etc.

Programmer vs System Analyst

The job of properly designing the application system is actually that of the System Analyst. According to Wikipedia…

A systems analyst researches problems, plans solutions, recommends software and systems, and coordinates development to meet business or other requirements.

So actually, what I have been asking of the programmer is in the job scope of the System Analyst. Ah, so maybe that’s why there is a distinction between the programmer and the System Analyst. The programmer is really a peon: he or she writes code according to specifications that have been handed down.

However, I still think the programmer needs to understand the underlying technology, and I have several reasons:

The System Analyst often works at too high a level, and may leave low-level implementation details to the programmer.
Depending on your workplace culture, applications may start off as ideas and proofs of concept developed by programmers. Even when the idea eventually takes off and people recognize that various things need to be redesigned, the original programmers are likely to be the lead drivers.
The line between System Analyst and programmer can be quite blurry. They are often the same people!

I’m not asking the programmer to be an expert in all trades. What is needed is for the programmer to have a good grasp of all of them, be an expert in several, recognize the limits of his or her knowledge and experience, and know where or whom to turn to in order to fill those gaps.

Understand the Underlying Platform!

My position is that, yes, good programmers will have to understand the underlying platform and technologies they develop for. This will differentiate the really good programmers from those who merely happen to know how to write programs.

This is particularly important for people who want to consider themselves “IT Professionals”. Otherwise, how would you distinguish IT-trained professionals from those who learnt programming in their spare time? (In fact, people who learn programming on their own time are already pretty much as good as IT-trained professionals anyway.)

I know people who like to think or dream a lot about abstraction. They don’t produce much work. There are others who work a lot, but they aren’t necessarily productive. Then, there are those stellar techies who seem to be able to poke their noses everywhere… they are the ones who produce stunning results.

Writing a working program is easy. Writing a program that works successfully in production is the real challenge.