CSC104 hardware 2

Other local hardware

Other than listening to the CPU fan or feeling the heat it dissipates, you would have a hard time detecting the activity of a computer carrying out instructions like those above, or instructing the computer to change its activities. Devices such as keyboards, mice, drives, and monitors are necessary to insert new content into memory (input) and to see the result of changes to memory (output). These, and other peripheral devices, are typically attached to the bus, and (in some architectures) are each given a range of memory addresses that can be loaded into and stored from registers, much as main memory can.
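The idea of giving a device its own range of addresses can be sketched with a toy model. This is only an illustration: the address ranges, the Bus class, and the "display" device are all made up for this example and do not describe any real architecture.

```python
# Toy model of memory-mapped devices: one address space shared by main
# memory and a peripheral. All sizes and names here are hypothetical.

RAM_SIZE = 256        # addresses 0..255 belong to main memory
DISPLAY_BASE = 256    # addresses 256..259 belong to a made-up display
DISPLAY_SIZE = 4


class Bus:
    def __init__(self):
        self.ram = [0] * RAM_SIZE
        self.display = [0] * DISPLAY_SIZE

    def store(self, address, value):
        """Copy a register value to an address; the address decides who receives it."""
        if 0 <= address < RAM_SIZE:
            self.ram[address] = value
        elif DISPLAY_BASE <= address < DISPLAY_BASE + DISPLAY_SIZE:
            self.display[address - DISPLAY_BASE] = value
        else:
            raise ValueError(f"no device at address {address}")

    def load(self, address):
        """Copy the value at an address back towards a register."""
        if 0 <= address < RAM_SIZE:
            return self.ram[address]
        if DISPLAY_BASE <= address < DISPLAY_BASE + DISPLAY_SIZE:
            return self.display[address - DISPLAY_BASE]
        raise ValueError(f"no device at address {address}")


bus = Bus()
bus.store(10, 65)    # an ordinary store to main memory
bus.store(256, 72)   # the same operation, but the display receives the value
```

From the CPU's point of view the two store operations look the same; only the address determines whether main memory or the device is involved.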

Many peripheral devices contain controllers with direct memory access (DMA) capability, which allows them to load and store bit patterns from and to main memory without having to make the transfer via a register on the CPU. This allows communication between memory and peripheral devices while the CPU is otherwise occupied, which lightens the load on the CPU.

The steady trend towards smaller components (transistors and other circuitry) makes computation faster (by reducing the distance signals have to travel on the CPU). Empirical observations (for example, Moore's law, which is stated in terms of the number of transistors on a chip) are often summarized as a doubling of computer speeds every eighteen months, but there are limiting factors: it becomes harder to safely dissipate the heat of the circuits as you concentrate them in a smaller space, and there is a limit to photolithography's ability to reproduce small circuits. Fast computing on the CPU must also be integrated with other devices, so the speed at which signals can be moved across the bus can become the bottleneck.

Mass storage

The contents of main memory typically disappear when the power is turned off. Longer-term storage is provided by media such as magnetic tape or disks, and compact discs. Storage on these devices is generally cheaper (dollars per bit) than main memory (and hence larger amounts are available) and less volatile (although it is too soon to know how long the information will last on them), and the media can be removed from the vicinity of a computer and stored elsewhere as an archive.

A disadvantage of mass storage compared to main memory is speed: access to information on disk is thousands of times slower than access to random access memory (RAM). When the media are not connected to a computer, time and human intervention are required to connect them (e.g. sliding a disc into a CD drive).

The first magnetic recorder was invented by Valdemar Poulsen in 1898. The read/write heads in magnetic storage record data in the form of magnetized spots on iron oxide coatings.

Magnetic tape: Information is stored on a magnetic coating over a long spool of tape. The tape is wound and rewound under computer control, and may be divided lengthwise into several tracks. Because tape is a serial access medium, random access to sectors is very slow, so tape is currently used mainly for archives (although tapes don't last forever: approximately 3-5 years). Large amounts of information can be stored on magnetic tape, which is why it is so frequently used for archiving and backing up data. Information can be lost when the width of the tape changes, even when it is stored in a dark, dry space. Magnetic tapes are also sensitive to the Earth's magnetic field, so after long periods of storage some machines may not be able to read them.

Magnetic disk: Information is stored in concentric circles (tracks) on magnetic disks. Each track is subdivided into sectors, each holding a fixed amount (e.g. 1024 bytes) of information. Since read/write "heads" move in unison between the disks on a common spindle, the combination of tracks a fixed distance from the centre is called a cylinder. The number and location of tracks, sectors, and cylinders are encoded when the disk is manufactured, or formatted. Information from a single sector can be accessed separately from other sectors, so the sector is the granularity of "random access" for this device. Data access times are typically measured in milliseconds (compare this to nanoseconds for RAM). The rotation speed is constant and each track holds the same number of sectors, so linear information density on tracks decreases as you move away from the spindle (unless the more recent zoned-bit recording is used). See [Wikipedia on hard disks].
Hard drives can fail if the read/write heads "crash" into the disk surface (drives are usually sealed to keep out particles that might cause a crash), or due to corrosion or humidity. Mean time between failures may be cited as five years, but your particular drive may be the one that crashes in five days. Backups or even redundant disks (RAID) are used to protect against this.
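The cylinder, head, and sector layout described above amounts to a three-part address for each sector. A minimal sketch of how such an address can be flattened into a single sector number (and back) follows. The geometry constants are invented for illustration, sectors are counted from 0 here for simplicity, and the 1024-byte sector size is the example figure used above.

```python
# Hypothetical disk geometry, chosen only to keep the arithmetic small.
HEADS = 4                 # read/write heads (one per disk surface)
SECTORS_PER_TRACK = 32    # the same on every track (no zoned-bit recording)
SECTOR_BYTES = 1024       # example sector size from the notes


def chs_to_index(cylinder, head, sector):
    """Flatten a (cylinder, head, sector) address into one sector number."""
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + sector


def index_to_chs(index):
    """Recover (cylinder, head, sector) from a flat sector number."""
    cylinder, rest = divmod(index, HEADS * SECTORS_PER_TRACK)
    head, sector = divmod(rest, SECTORS_PER_TRACK)
    return cylinder, head, sector


index = chs_to_index(2, 1, 5)        # cylinder 2, head 1, sector 5
byte_offset = index * SECTOR_BYTES   # where that sector's first byte would sit
```

All sectors of one cylinder come out with consecutive numbers, which matches the physical fact that they can be reached without moving the heads.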

Compact disc: Information is stored by changing the reflective coating on these discs, and read back by shining a laser on the disc and noting changes in the reflected light. A single track is laid down in a spiral, from the centre of the disc to the outer edge of the disc. The spiral is divided into sectors, and the linear density is uniform (so data transfer is faster near the outer edge of the spiral, if the CD spins at a uniform rate). CDs have random access to sectors, meaning that information can be read from any point on the disc, but this is slower than on magnetic disks. Problems that may arise with compact discs are: the plastic coat can change its size, the adhesive layer may fail, and/or the ink could leak from the top layer into the layer containing all the information.
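The claim about faster transfer near the outer edge is a matter of simple geometry, which can be checked with a short calculation. The radii below (about 25 mm and 58 mm for the innermost and outermost parts of the spiral) and the rotation rate are assumptions chosen for illustration, not figures from these notes.

```python
import math

# Assumed values, for illustration only.
INNER_RADIUS_MM = 25.0
OUTER_RADIUS_MM = 58.0
REVS_PER_SECOND = 10.0    # a uniform rotation rate


def track_speed_mm_per_s(radius_mm, revs_per_second):
    """Length of track passing under the laser each second at a given radius."""
    return 2 * math.pi * radius_mm * revs_per_second


inner_speed = track_speed_mm_per_s(INNER_RADIUS_MM, REVS_PER_SECOND)
outer_speed = track_speed_mm_per_s(OUTER_RADIUS_MM, REVS_PER_SECOND)

# With uniform linear density, bits per second are proportional to track speed,
# so this ratio is also the ratio of the data transfer rates.
speedup = outer_speed / inner_speed
```

Under these assumptions the outer edge delivers a little more than twice as much data per second as the inner edge, and the ratio does not depend on the rotation rate chosen.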

Flash memory: Some USB drives store data without macroscopic moving parts by trapping electrons in tiny chambers of silicon dioxide. These charges are kept for years without external power, so flash memory is suitable for off-line storage. Without moving parts, flash memory is immune to physical shock and some of the other mishaps of moving platters or disks. The drawback that prevents flash memory from replacing current (transistor-based) RAM is that this type of memory is best changed in large blocks (rather than at the byte or bit level), and the silicon dioxide cells are eventually damaged by repeated re-writing of their contents.
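The cost of changing flash memory in blocks can be illustrated with a toy model. The block size, the FlashBlock class, and the rewrite counter are hypothetical and deliberately tiny; real devices use much larger blocks and more elaborate bookkeeping.

```python
BLOCK_BYTES = 8    # unrealistically small, to keep the example readable


class FlashBlock:
    """Toy block that can only be changed by rewriting all of it."""

    def __init__(self):
        self.data = bytes(BLOCK_BYTES)
        self.rewrites = 0      # each rewrite wears the cells a little

    def write_byte(self, offset, value):
        # Changing one byte still means copying and rewriting the whole block.
        updated = bytearray(self.data)
        updated[offset] = value
        self.data = bytes(updated)
        self.rewrites += 1


block = FlashBlock()
for offset, value in enumerate(b"abc"):
    block.write_byte(offset, value)   # three small changes, three full rewrites
```

Three one-byte changes cost three whole-block rewrites in this model, which suggests why frequent small updates (the normal pattern for RAM) suit this kind of memory poorly.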

From bits to files

The format of the information we want our computer to keep track of (that poetry we're writing, our emails to our mother, our database of favourite tunes) is very unlikely to match the hardware design of the computer's mass storage. We're unlikely, for example, to write documents that are some exact multiple of 1024 bytes long, just because our hard drive's sectors are that length. In practice, the information we are interested in is spread over multiple sectors, and the same sector may contain parts of several units of information (commonly called files). We users need to be "protected" from the low-level details of file storage, and computers do this by reserving part of main memory as a buffer, where files are re-assembled when we need to work on them, and then stored in available sectors on the mass storage device when we're finished.
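The splitting and re-assembly described above can be sketched in a few lines. This is a simplification under stated assumptions: the disk is modelled as a dictionary, the list of free sector numbers is invented, and each sector holds data from only one file (unlike the more general case mentioned above).

```python
SECTOR_BYTES = 1024          # example sector size from the notes

disk = {}                    # sector number -> bytes stored in that sector
free_sectors = [7, 2, 19, 4] # hypothetical free sectors, in no particular order


def save(data):
    """Store data in whatever sectors are free; return the sectors used, in order."""
    locations = []
    for start in range(0, len(data), SECTOR_BYTES):
        sector = free_sectors.pop(0)
        disk[sector] = data[start:start + SECTOR_BYTES]
        locations.append(sector)
    return locations


def load(locations):
    """Re-assemble a file in a memory buffer from its list of sectors."""
    return b"".join(disk[sector] for sector in locations)


poem = b"x" * 2500           # a 2500-byte "document"
poem_sectors = save(poem)    # needs three sectors; the last one is only partly full
unused_bytes = len(poem_sectors) * SECTOR_BYTES - len(poem)
```

The user only ever sees the re-assembled bytes returned by load; the scattered sector numbers stay hidden behind the two functions.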

Rather than a soup of numerical addresses of disks, cylinders, sectors, and bytes, the computer presents us with this information grouped into a file, usually with a name formed from a sequence of characters.

The concept of a file is general enough to include files that contain the names and locations of other files. This is what underlies what, as computer users, we deal with as directories or folders. And this trick can be repeated: a directory or folder can contain another directory or folder, creating a hierarchical or tree-like structure of folders. An example is the directory where some student accounts that begin with 'c6' "live" on cdf (directory /h/u4/c6/01). There are 208 of them (yours might well be among them), and some of these have subdirectories A1, A2, and (soon) A3. A common way to conceive of this is as an upside-down tree. The root (at the top) is a directory that contains subdirectory h (for home directories), and other fundamental directories (such as bin, where important applications are stored). At the bottom are the leaves, individual files that don't contain other files. To describe the path from the top to the bottom, we use some character that isn't part of file names to separate the directories in the chain. For example '/u/heap/pub/alice30.txt' is the path to my file called 'alice30.txt'.
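Following a path through the tree can be sketched with nested dictionaries standing in for directories. The tree below is a hypothetical fragment built around the example path above; the file contents and the empty bin directory are placeholders.

```python
# Directories are dictionaries (name -> entry); leaves are plain strings.
# The contents shown are placeholders, not the real files.
root = {
    "bin": {},
    "u": {
        "heap": {
            "pub": {
                "alice30.txt": "(text of the file)",
            },
        },
    },
}


def resolve(path):
    """Walk from the root, one directory name at a time, to the named entry."""
    node = root
    for name in path.split("/"):
        if name:              # skip the empty pieces around '/' separators
            node = node[name]
    return node


leaf = resolve("/u/heap/pub/alice30.txt")   # a leaf: an individual file
folder = resolve("/u/heap")                 # an inner node: a directory
```

Each '/' in the path corresponds to one step down the tree, and a missing name raises a KeyError, much as a real system reports that a file or directory does not exist.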

Managing files in a human-intelligible form assumes that our computer runs more than one program. Otherwise, a computer dedicated to a single computation would simply manipulate the details of a cylinder/sector/byte in a single program, and no human (other than the original programmer) would have to deal with this level of detail. However, many computers are not dedicated to a single computation, and this has raised the need for file management, and other functions that we expect of an operating system (OS).