# 4.1 Introduction to Abstract Data Types

In the first few weeks of the course, we have mainly played the role of the *class
designer and implementer*.
However, you have actually spent most of your programming career in the
opposite role: whenever you use one of Python's built-in functions or data structures, you only worry about what it does, not how it works.
In other words, you are a *client* of the built-in Python libraries and classes.

Some concepts are so general and so useful across many problems that they transcend any specific programming language.
An **abstract data type** (or ADT)
defines some kind of data and the operations that can be performed on it.
It is a pure interface,
with no mention of an implementation---that's what makes it *abstract*.
<!-- David -->
In contrast to this, a **data structure**
is a concrete strategy for storing some data.
For example, one data structure we could use to store the grades of a class on a series of course activities
is a list of lists:

```python
grades = [['Sadia', 78, 82],
          ['Yuan', 75, 64],
          ['Elise', 80, 71]]
# To know what the "columns" are for,
# we could use another list:
items = ['A1', 'midterm']
```

An alternative data structure is a dictionary of dictionaries:
```Python
g2 = {'A1': {'Sadia': 78, 'Yuan': 75, 'Elise': 80},
      'midterm': {'Sadia': 82, 'Yuan': 64, 'Elise': 71}}
```

You can very likely think of other options, and may find it interesting
to consider pros and cons of the two data structures above.
But the point here is that
ADTs are fundamentally concerned with the *what*: what data is stored, and what we can do with this data.
Data structures are concerned with the *how*: how is that data stored, and how do we actually implement our desired methods to operate on this data?
This distinction is crucial to the kind of abstract thinking you'll develop as programmers: by separating the *what* from the *how*,
you'll gain substantial benefits,
as we are about to learn.

## Some famous abstract data types

In this section, we'll briefly describe some of the most common abstract data types in computer science.
Now, while computer scientists generally agree on what the "main" abstract data types are, they often disagree on what operations each one actually supports.
You'll notice here that we've taken a fairly conservative approach for specifying operations, limiting ourselves to the most basic ones.

-   **Set**

    -   Data: a collection of unique elements
    -   Operations: get size, insert a value (without introducing duplicates), remove a specified value, check membership

-   **Multiset**

    -   Data: a collection of elements (possibly with duplicates)
    -   Operations: same as Set, but the insert operation allows duplicates

-   **List**

    -   Data: an ordered sequence of elements
    -   Operations: access element by index, insert a value at a given index, remove a value at a given index

-   **Map**

    -   Data: a collection of key-value pairs, where each key is unique and associated with a single value
    -   Operations: lookup a value for a given key, insert a new key-value pair, remove a key-value pair, update the value associated with a given key

-   **Iterable**

    -   Data: a collection of values (may or may not be unique)
    -   Operations:
        iterate through the elements of the collection one at a time.

Many of these will sound familiar, as you'll have used them in your past programming experience, even if you haven't heard the term "abstract data type" before!
For example, you have probably used both a Python `list` and `dict`.
However, there is a really important distinction between Python's built-in classes and the ADTs we've listed above:
`list` and `dict` are data structures, *not* abstract data types.
That is, they're concrete implementations, and every necessary decision about *how* these classes store their data and implement their methods has been made.
Indeed, the designers of the Python programming language put a great deal of effort into
these implementations so that `list` and `dict` operations are extremely fast.

So a `dict`, for instance, is not itself an ADT.
But it is fair to say that a `dict` is a natural implementation of the Map ADT.
However,
*there is NOT a one-to-one correspondence between ADTs and data structures*,
in Python or any other language.

A single ADT can be implemented by many different data structures.
For example, although the Python `list` is a natural implementation of the List ADT, we could implement it instead with a `dict` in which each key is the index of its item.
A list of 3 elements, `hello`, 42, and `goodbye` in positions 0, 1, and 2 respectively, would be

```python
{0: 'hello', 1: 42, 2: 'goodbye'}
```

On the flip side, each data structure can be used to implement multiple ADTs.
The Python `list` can be used to implement not just the List ADT, but each of the other above ADTs as well.
For instance, think about how you would implement the Set ADT with a `list`, and in particular, how you would avoid duplicates.[^1]
A `dict` could also implement any of the ADTs above, and the same is true of the new data structures you will learn in this course.


## The value of knowing all the standard ADTs

We've said that each ADT is so general that it transcends any individual problem or program or even programming language.
And in fact the ADTs given above are implemented in other programming languages (to varying degrees).
While the exact data structures used to implement them vary significantly from language to language,
these ADTs concepts form a common vocabulary whose understanding is necessary to being a professional computer scientist.

[^1]: Beginning Python programmers often implement the Set ADT with a `list`, but Python has a built-in `set` class that implements the Set ADT, does all the work of duplicate-avoidance for you, and does it very efficiently.
