Over the past week, I put together a primer on how computers work. It's designed for someone with basic computer literacy, but not a whole lot more. The whole thing is long, but hopefully written at an accessible level.
When most people look at a computer, they see a piece of technological magic. There's no better way to put it than that. You press a key or move a mouse and magically something on the screen changes. If you have programming experience, you might be able to talk about the code that causes it to change, but I've found lots of programmers treat their work as a set of incantations and don't really understand what happens at the lower levels. Hopefully, this will provide a glimpse into the mysteries of the operation of a computer, from silicon chips to high level code. It won't magically make you an expert, but it will act as a base if you ever want to know more.
At its core, a computer is made up of millions or even billions of devices known as transistors. These are created by a number of chemical and photographic processes on silicon wafers. You can think of it as developing a picture on a thin piece of glass, except instead of ink, UV lasers and semiconductors act as the pigments. You might have heard terms like "22 nanometer architecture". That refers to the minimum width of a feature that can be etched into the silicon. The number is significant because smaller architectures translate to more transistors on a chip and less power used per transistor. To give you an idea of how small 22 nm is, the most powerful visible light microscope possible could not resolve anything that small, simply because the wavelength of visible light, around 390 nm at its shortest, is too big. You need an electron microscope to look at it.
A transistor operates like a switch. You can open it (turn it off) or close it (turn it on) by applying a voltage to a control point. Depending on the design, the transistor may turn off or on based on the voltage applied to the control. That's a simplification, but for the purposes of a computer you only need to think of a transistor as a switch. Here is the symbol for a transistor:
The wire labeled B is called the Base. It is the control point. C is the Collector and E is the Emitter. All you need to know is that they are where electricity flows when the transistor is on. The arrow points either towards or away from the emitter, depending on the type of transistor. Again, the reason why isn't important for a general understanding.
As you can see below, when it has power applied to it, the transistor is on and current flows through it:
When there is no input voltage to its control, current no longer flows. It behaves the same as a break in the wire:
But how do you go from a switch to a computer? You can't play Tetris with just light switches, right? Well, it all comes down to what are known as logic gates. (Also, someone sort of did make a Tetris-playing computer out of the equivalent of switches.)
Logic gates are standard arrangements of transistors that perform a logical operation. For example, a logical AND is True if both inputs are True and False in all other cases. We can represent True as a closed switch and False as an open one. Now, if you put two switches in series - that is, the output of one is hooked to the input of the next - you can only get current to pass through if both are turned on. That's exactly the same as a logical AND. Below, if both A and B are on, you get current flow. Otherwise, at least one will block the current:
The same applies to an OR, except you want it to be True if any input is True. From the perspective of switches, you put them in parallel. If two of them are connected to the same input and the same output, then as long as either of their control signals is True, the pair acts the same way as a single closed switch. It's a little hard to see, but either A or B could be turned on and there would be a path through them:
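If it helps to see that in code, here's a tiny sketch in Python (purely illustrative; real gates are voltages and transistors, not booleans) of switches in series and in parallel:

```python
def series(a, b):
    # Two switches in series: current only flows if both are closed.
    # That's a logical AND.
    return a and b

def parallel(a, b):
    # Two switches in parallel: current flows if either is closed.
    # That's a logical OR.
    return a or b

print(series(True, False))    # False: the open switch blocks the path
print(parallel(True, False))  # True: the closed switch still provides a path
```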
With different combinations of gates, you can perform any logical operation you would care to define. But how do you go from logical operations to something that can carry data or instructions? Well, the answer lies in binary. Each binary digit - known as a bit - can be either one or zero. In the same way, a logic gate can be either True or False. So we assign True to be equal to the binary one and False to be zero. Likewise, you can make larger numbers by putting bits side by side. One bit can only represent 2 numbers (0 and 1), but with every bit you add you multiply the number of values by two. It's like how each digit you add to our usual base-10 numbers multiplies the maximum value by ten. So with two bits, you get four values, three bits gives eight values, and so on. When you have eight bits you can represent two-hundred-fifty-six numbers, or two to the eighth power. This is known as a byte and is a basic unit of information. A byte can be a number, part of an instruction to a computer, a single text character (i.e. ASCII), part of a pixel in an image, or any type of communication you care to name. Below is an ASCII table showing how all basic English characters can be mapped to the numbers 0-127. That is only 128 values, and as I previously mentioned, one byte holds two-hundred-fifty-six possible values. So one character of plain text fits easily in one byte of memory.
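You can poke at all of this yourself in most programming languages. A quick Python sketch of the bits-and-bytes idea:

```python
print(2 ** 8)      # 256: the number of values one byte can hold
print(bin(65))     # 0b1000001: the bits that make up the number 65
print(chr(65))     # 'A': the character ASCII assigns to the number 65
print(ord('a'))    # 97: and the mapping works the other way too
```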
Going back to logic gates, it can be proven that with a set of gates, you can perform any arithmetic operation. Basically, any calculator's function can be performed by a set of logic gates. Computers actually have arrangements of gates called Arithmetic Logic Units or ALUs that are specifically designed to do math. But gates can do other things too. Certain arrangements of gates can be used to store bits of data: whatever logical value is at their input when they receive a store signal is the one they will output until told otherwise. Gang billions of these up together, and you can store any information you want. These are the building blocks for certain types of RAM.
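To make the storage idea concrete, here's a simplified Python model of one classic arrangement, an SR latch built from two cross-coupled NOR gates (a sketch of the principle, not how real memory cells are physically laid out):

```python
def nor(a, b):
    # A NOR gate: True only when both inputs are False.
    return not (a or b)

class SRLatch:
    """Two cross-coupled NOR gates that remember one bit."""
    def __init__(self):
        self.q = False       # the stored bit
        self.q_bar = True    # its complement

    def update(self, set_, reset):
        # Run the feedback loop a couple of times so it settles.
        for _ in range(2):
            self.q = nor(reset, self.q_bar)
            self.q_bar = nor(set_, self.q)
        return self.q

latch = SRLatch()
latch.update(set_=True, reset=False)   # store a 1
print(latch.update(False, False))      # True: it remembers with no input
latch.update(set_=False, reset=True)   # store a 0
print(latch.update(False, False))      # False: it remembers that too
```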
You can also arrange logic gates to send very fast repeating signals. If they are turning on and off in a regular pattern, you could say they imitate the tick-tock of a clock. In fact, that's what they are used for: a reference time for the whole system. That may not seem important, but a clock is what drives a computer. It takes time for a series of logic gates to operate, and during that time the value they output may change. How can a processor tell what value is correct? The answer is that the maximum time for the longest section of logic can be calculated. The clock is set just slow enough to allow that logic time to complete. When the clock switches from one value to the other, the results of the logic are assumed to be complete and are stored. It's also a signal that the next set of calculations can begin. So if you hear a system runs at 4 GHz, that means the clock is ticking back and forth four billion times a second. And it does something on each one of those ticks.
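A quick bit of arithmetic shows just how short those ticks are:

```python
frequency = 4e9              # 4 GHz: four billion ticks per second
period = 1 / frequency
print(period * 1e9, "ns")    # 0.25 ns: each tick lasts a quarter of a nanosecond
```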
There's one final part that makes up a processor. I've talked about how transistors form the logic and handle mathematical calculations, how they can store information, and how they decide when they are done with something and ready to move on to the next task. But I haven't said anything about how these different parts are controlled. At the heart of any computer lies something known as a state machine. At a low level, it is created with logic gates and transistors like all the other parts. But what it does is direct all the other parts of the system. You can think of it as a flow chart. Based on an input it decides what to do next. Then other inputs tell it to do something else. Here is a picture of the state machine from a very simple computer. It probably won't make much sense, but you can think of each bubble as a place where the computer performs a step in solving a problem:
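If flow charts read better to you as code, here's a toy state machine in Python (a made-up four-state cycle, far simpler than a real control unit):

```python
# Each state names the next state in the cycle, like arrows between bubbles.
transitions = {
    "FETCH":   "DECODE",    # read the next instruction from memory
    "DECODE":  "EXECUTE",   # figure out what the instruction means
    "EXECUTE": "STORE",     # do the actual work
    "STORE":   "FETCH",     # save the result, then start over
}

state = "FETCH"
for _ in range(8):          # watch it walk through two full cycles
    print(state)
    state = transitions[state]
```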
But what are these inputs?
Now we get to code. Every computer has what's called an Instruction Set Architecture (ISA). All that means is it will take a set of binary instructions and do things with them. For instance, one instruction could tell the computer to add two numbers. The state machine has a little counter that tells it to get an instruction from a certain location in memory. It does so, then looks up the instruction code and discovers it is an addition instruction. It then moves to the addition branch of its flow. Next it gets the values of the two numbers and sends them to the logic gates that handle addition. Then it puts the result in its place and increments the counter that tells it where to get the next instruction. Finally it returns to the start of its flow to fetch whatever instruction the counter now points at.
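Here's that whole fetch-and-execute loop as a toy Python interpreter (a made-up instruction format, not any real ISA):

```python
# Each "instruction" is a tuple: an opcode plus its operands.
memory = {
    0: ("ADD", 5, 7, "r0"),   # add 5 and 7, put the result in r0
    1: ("ADD", 2, 2, "r1"),
    2: ("HALT",),
}
registers = {}
pc = 0                            # the counter pointing at the next instruction

while True:
    instruction = memory[pc]      # fetch
    opcode = instruction[0]       # decode
    if opcode == "HALT":
        break
    if opcode == "ADD":           # execute
        _, a, b, dest = instruction
        registers[dest] = a + b   # put the result in its place
    pc += 1                       # move on to the next instruction

print(registers)                  # {'r0': 12, 'r1': 4}
```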
As I said previously, a computer receives its instructions in binary. But if you look at any modern computer language, it's definitely not ones and zeros. That's because it is very, very hard to read and write binary code (I've done it and prefer to avoid it at all costs). Instead, programmers use an intermediate program called a compiler. A compiler takes the sort of code most programmers would write and transforms it into binary. The process is more complex than it sounds because high level code (the type programmers work with) and machine code (binary) don't always have direct translations. A compiler also includes optimizations to make the result execute faster or take up less space. For example, if an operation always results in the same value, that value will be hard-coded. Code that can never be reached will be thrown away. Things that can be done more efficiently in different orders will be moved around. Compiler design is an extremely complex and very technical subject.
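To make those optimizations concrete, here's a before-and-after sketch, using Python as stand-in pseudocode (a real compiler does this on a much lower-level representation of the program):

```python
# What the programmer writes:
def area():
    width = 3
    height = 4
    if width > 100:          # can never be true...
        print("huge!")       # ...so this line is unreachable
    return width * height

# What an optimizing compiler effectively turns it into:
def area_optimized():
    return 12                # 3 * 4 folded into a constant, dead branch removed
```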
Our model of a computer can now make decisions and perform calculations based on code we give it. But it is still missing something. We can tell it to run a piece of code, but only one piece at a time. When you use a PC, there are dozens of things happening at once: the display is rendering, sound is playing, you could be moving the mouse, the wifi is downloading a file, you're receiving an email, and other less obvious actions. Each is a different program, and there has to be a way for them all to appear to run at once.
Modern processors have multiple cores. A core is basically a fully contained processor in its own right. It can operate largely independently of the others, but it can only run one program at a time. A set of code running on a processor is known as a thread, and most processors can run a single thread per core. Intel has worked some real magic (called hyper-threading) into their i7 line of processors that allows two threads to execute on each core, but that's not really important here. There's still the problem that you might have twenty or a hundred threads all trying to run at once. The trick is that they don't *have* to run at once for a computer to operate normally. When I press a key, it doesn't matter if the computer processes it in a microsecond or a millisecond as long as it shows up on the screen before the next display refresh. The same is true for other threads. A computer uses a piece of code known as a scheduler to decide which thread to run. It looks at all the threads that need to run and decides which can use the processor next. And it does it in such a way that, to you as a user, it looks like every thread is running simultaneously.
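The simplest scheduling strategy is round-robin: give every thread a short turn, over and over. A minimal Python sketch of the idea (real schedulers also weigh priorities, deadlines, and which threads are waiting on hardware):

```python
from collections import deque

ready_queue = deque(["render display", "play sound", "download file"])

for _ in range(6):                  # six time slices
    thread = ready_queue.popleft()  # pick whichever thread is next in line
    print("running:", thread)      # let it use the core for a brief slice
    ready_queue.append(thread)      # then send it to the back of the line
```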
The scheduler is at the core of what is known as an operating system. The operating system or OS is what differentiates a computer from a single purpose microprocessor. Like I mentioned, it handles the scheduling of different threads. It also provides a buffer between physical hardware and most programs. Why is that important? Well, say you have a program that needs some space to store its data. Without an operating system, there wouldn't be a way for it to know whether some other program was already using that space, or even if enough space existed. With an OS, the program simply asks for a chunk of memory and the OS handles the assignment. The OS also provides libraries of code that other programs can use. These libraries may do different things based on the specific hardware they run on, but from another program's perspective they provide the same result for the same input. A program doesn't care whether its request goes out over Ethernet or wifi, or who made the device. The OS provides a library that handles the hardware, and the program just knows its data was sent over the internet.
Likewise, the OS handles interfacing with things like hard drives, USB sticks, keyboards, mice, displays, and any other peripherals. All of that communication is done over standard protocols. These define things like connector shape, how fast you can go, how a device announces its presence when plugged in and so on. Again, the OS does quite a bit of abstraction. It handles the physical communication, and sends instructions or gets back data at the request of other programs.
At this point I've mentioned memory a few times. A computer generally has three types: Cache, RAM, and Disk. Also registers, but those are primarily used for holding data being actively worked on by the CPU.
A cache is built into the processor. You'll see it referred to as L1, L2, and L3, standing for Level 1, Level 2, and Level 3 respectively. The level corresponds to its distance from the cores it services. Each core will have its own L1 cache. Caches are small (ranging from hundreds of thousands to tens of millions of bytes), but take almost no time to access, often less than a nanosecond. L2 and L3 are larger and shared across the entire processor. They require several nanoseconds to use, but are still part of the processor. Whatever code is being run or data is being processed will be present in the cache. Generally, a system with a larger cache will handle large data sets or switching between different pieces of code better.
Previously I said RAM can be created purely from transistors. That isn't precisely true. Different types of RAM use different methods, though transistors in an arrangement called a flip-flop are one option. How it works precisely isn't as important as what it is used for. RAM, which stands for Random Access Memory, is generally used to hold large chunks of data or programs that aren't in current use but will be shortly. It is nowhere near as fast as a cache, with access times on the order of tens of nanoseconds, but still very quick. It also has high bandwidth. Bandwidth means the amount of data transferred per second and is given by the clock speed of the RAM multiplied by the number of bits transferred per clock cycle. If you've seen a stick of RAM, you know it is very wide. That width is what gives you the bits per cycle: each connection is a way for the processor and the RAM to talk to each other. The resulting bandwidth tends to be tens of gigabytes per second on a modern system. The downside is that RAM is small and expensive. You pay ten to twenty dollars per gigabyte, and most computers max out at 64 GB. The wide connection also takes up lots of space on a circuit board that could be used for other purposes. Worse, if you lose power, any data in RAM disappears. This means RAM isn't suitable for long term storage.
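The bandwidth formula is easy to try with real-ish numbers (these are for a single DDR3-1600 channel; your system may differ):

```python
transfers_per_second = 1_600_000_000   # DDR3-1600 moves data 1.6 billion times/s
bits_per_transfer = 64                 # the width of one standard memory channel

bandwidth_bits = transfers_per_second * bits_per_transfer
print(bandwidth_bits / 8 / 1e9, "GB/s")   # 12.8 GB/s
```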
Finally, there is disk space. This is where the computer puts things it has no current interest in, but wants to save. Files and unused programs spend their time here. Disk space is cheap and plentiful, usually measured in trillions of bytes. Disks also don't require power to save your data, storing it on magnetic platters in Hard Disk Drives (HDDs) and in flash memory in Solid State Drives (SSDs). Unfortunately, they are slow. Very slow. It takes a decent SSD a fraction of a millisecond to return a piece of information. An HDD is even slower, taking several milliseconds because it has to physically spin the disk and move the read head, unlike an SSD, which has no moving parts (hence the term Solid State Drive). That's night and day compared to the nanosecond speeds of a cache. You also don't get the large bandwidth of RAM, as a consequence of a lower clock speed and fewer connections. A disk is a semi truck to RAM's Ferrari to a cache's F1 racer. But they all fit different functions and all are necessary.
As an aside, but somewhat related to memory, I'd like to talk about the difference between 32-bit and 64-bit operating systems. More bits are better, right? Yep, but the question is why. Doesn't 64-bit just mean 8 bytes? How is that much better than 4 bytes? Well, every time you store a byte of data in cache or RAM, it's assigned an address. In order to make accesses and coding faster, the computer guarantees this address will be no more than a certain number of bits long. For a while, that was 32 bits. But RAM sizes have risen quite a bit over the years. It is very common to have eight or sixteen or thirty-two gigabytes of RAM. But if you do the math, there are only enough values expressible in 32 bits to give four gigabytes of memory addresses. That means older systems running operating systems that can only handle 32-bit addresses are limited to effectively using four gigabytes of RAM. Meanwhile, 64-bit systems are safe until RAM sticks start coming in million-terabyte denominations.
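The math is quick to check:

```python
print(2 ** 32)          # 4,294,967,296 addresses: just 4 gigabytes
print(2 ** 64 / 1e12)   # about 18.4 million: that many terabytes fit in 64 bits
```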
The only big part of a computer I haven't touched on is the video card. In all honesty, a video card is a self-contained computer. It has its own processor, cache, and RAM, after all! So why do you need one? It comes down to the primary workload of each system. A standard processor is great at going from Point A to Point B, then deciding if it should go to Point C or Point D. It can do this a few billion times per second per core. That should definitely be enough to run a program and display a screen, right? Well...
Consider that a 1080p screen has a hair over two million pixels. Updating those sixty times per second ends up requiring over a hundred and twenty million operations per second. That's not actually too much if the data is provided by an external source. A single core can handle it fairly easily. But what if it's a video game? The value of any given pixel might depend on hundreds of different things. Effects like explosions create thousands of different light sources. Now you need to perform hundreds of billions of operations per second. If you are playing in 4K or VR, you might need trillions. I'll turn it over to the Mythbusters for a visual representation of the process.
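The pixel arithmetic works out like this:

```python
pixels = 1920 * 1080                   # 2,073,600: a hair over two million
refreshes_per_second = 60
print(pixels * refreshes_per_second)   # 124,416,000 pixel updates per second
```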
The only thing that allows us to keep up with this insane workload is that the different operations are relatively independent. You don't need to know the results of computations one through nine hundred ninety-nine to get the answer to number one thousand. If you had a thousand processors, you could do every calculation at the same time. That's exactly what modern GPUs do. For example, the GTX1080 has just over 2500 processor cores. Operating at 1.6 GHz, it can perform over four TRILLION instructions per second. And each instruction can do two mathematical operations, leading to the insane capacity to perform 8.2 trillion operations per second. But how does it do that when there's no way you could fit that many i7 chips in that space? Not to mention 2500 i7's would draw hundreds of kilowatts of power. The answer is that an i7 has to do lots of things a GPU core doesn't. Remember I said a computer's core was optimized to go from A to B then decide where it should go next? The decide part is very important. Everything after that decision depends on it. Generally code has lots of these decision points. It's only graphics and a handful of other applications like cryptography and bitcoin mining that have lots of independent calculations. That's where GPUs shine. Most of the complexity of a core design comes from those decision points; since a GPU doesn't need them, it can have thousands of very simple cores.
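The arithmetic behind those GPU numbers is just as simple:

```python
cores = 2560                     # GTX1080 shader cores
clock = 1.6e9                    # 1.6 GHz
ops_per_instruction = 2          # a multiply-add counts as two operations

print(cores * clock / 1e12)                         # ~4.1 trillion instructions/s
print(cores * clock * ops_per_instruction / 1e12)   # ~8.2 trillion operations/s
```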
There you have it, a thousand-foot view of a computer, from top to bottom. This isn't meant to make you an expert. I left out huge areas and hand-waved others. You could study some of this for a decade and not be an expert. But it's meant to give you a view into the basics. And if you'd like to know more about any specific parts, let me know. I'd be happy to help.