3. Data: Types, Values, Variables, and Names

A computer program is a series of instructions that operate on data. To build effective programs, you will need to understand several basic building blocks and then how to combine and manipulate those blocks.

Another way to look at programming is to consider Lego bricks.
[Image: Lego bricks. Source: https://commons.wikimedia.org/wiki/File:Lego_bricks.jpg]

These bricks come in numerous shapes, sizes, and colors. Anyone can quickly start combining them, and novices can produce complex models while following instructions. However, Lego master builders can create truly novel models. They accomplish these builds by deeply understanding how the different pieces work together. They also spend a significant amount of time planning their builds - just as we do when we document the algorithmic approach in the first four steps of the “Seven Steps”.

Similarly, to become a proficient programmer, you must learn a core set of concepts (building blocks). These concepts are universal to many different programming languages; they just vary with slight differences in syntax. (Syntax is the set of rules defining the structure of a computer language.)

We will start our journey of learning these rules by looking at different ways of representing data.

To enable programs to process data, every piece of data is classified into a type. A type specifies how its data is stored and how we can inspect and manipulate that data.

3.1. Objects

Python manages all data as objects. From a programming perspective, objects contain:

  • state (properties and values - things an object knows)

  • behavior (methods/ functions - things an object can do)

  • an identity (something that uniquely identifies that item).

To help manage these objects, Python also tracks:

  • an object’s type - this defines what an object can do, and the data it stores

  • a reference count that tracks how many variables or other objects refer to this object.

Many other programming languages have the concept of a primitive type, which simply contains a particular value. Python has no such types: even a simple number is a full object.

Every object is associated with a particular type. That type determines what state (information) an object can know and what methods (behavior) that object has.
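As a rough illustration of the list above, the following sketch inspects one string object with built-in tools. The variable name school is made up for this example. Note that sys.getrefcount() reports a count maintained by the standard CPython interpreter; the exact number is an implementation detail and is often higher than you might expect.

```python
import sys

school = "Duke"  # hypothetical example value

# Type: determines what the object stores and what it can do
print(type(school))             # <class 'str'>

# Identity: an integer unique to this object while it exists
# (the actual number differs from run to run)
print(id(school))

# State: the characters the string holds
print(school)                   # Duke

# Behavior: a method that the str type provides
print(school.upper())           # DUKE

# Reference count: CPython-specific; the exact number varies
print(sys.getrefcount(school))
```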

Practically, objects help us represent something. Initially, we will just use numbers and text. However, objects can also be more complicated data collections - such as a list of items. In later notebooks, we learn how to build custom object types (classes) to provide abstractions (models) of the world we are trying to represent in our programs.

3.2. Types

The table below shows Python’s built-in types. While this list may seem overwhelming as a new programmer, we will use most of these regularly (the notebooks will not use the complex type and will only use the bytes and bytearray types infrequently). Right now, just look at the different types of numbers (int and float) and the string type. The following notebook will cover these types in more depth. Ints, floats, and strings should feel very familiar to you as you interact with these values regularly - these types represent counts, money, tips, messages, etc.

| Name | Type | Mutable? | Examples |
|---|---|---|---|
| Boolean | bool | no | True, False |
| Integer | int | no | 42, 17966, 17_966 |
| Floating point | float | no | 3.14159, 2.7e5 |
| Complex | complex | no | 3j, 5 + 9j |
| Text string | str | no | 'Duke', "University", '''a multiline string''' |
| List | list | yes | ["Duke", "NC State", "Notre Dame", "UNC"] |
| Tuple | tuple | no | (2, 5, 8) |
| Bytes | bytes | no | b'ab\xff' |
| Byte array | bytearray | yes | bytearray(…) |
| Set | set | yes | set([1, 2, 3, 5, 7, 11, 13, 19]) |
| Frozen set | frozenset | no | frozenset(['Elsa', 'Anna', 'Olaf']) |
| Dictionary | dict | yes | {'Pratt': 'School of Engineering', 'Fuqua': 'School of Business', 'Sanford': 'School of Public Policy'} |

Mutable indicates if an object’s properties (state) can be changed once that object has been created.

So, once we create a string or an integer, we can no longer change its value. Any updates to the value result in a new object.

As an analogy, think of an immutable object as a clear, sealed box. You can peek at the contents, but you cannot touch or manipulate them.
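To make the distinction concrete, here is a small sketch contrasting a mutable list with an immutable string. The variable names are invented for this example; id() is used only to check whether we are still looking at the same object.

```python
# Lists are mutable: append() changes the existing object in place.
teams = ["Duke", "UNC"]
list_id_before = id(teams)
teams.append("NC State")
print(teams)                          # ['Duke', 'UNC', 'NC State']
print(id(teams) == list_id_before)    # True: still the same object

# Strings are immutable: upper() cannot alter name, so it returns a new object.
name = "Duke"
shouted = name.upper()
print(name)                           # Duke
print(shouted)                        # DUKE
print(shouted is name)                # False: two separate objects

# Attempting to change a string in place raises a TypeError.
try:
    name[0] = "L"
except TypeError as error:
    print("TypeError:", error)
```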

Within Python, a value can be expressed as a literal value or as a variable. The preceding table shows literal values within the example column.

3.3. Variables

A variable is a name that refers to a value (i.e., an object). While you can think of variables as a place to store data in the computer’s memory, more accurately, in Python, variables are basically names (labels) attached to objects.

This code block declares three variables:

x = 6
pi = 3.14159
message = "Hello World!"

x refers to an integer object, pi refers to a floating-point object, and message refers to a string (text) object. The equals sign (=) specifies an assignment statement. Programmers read these statements as:

  • Set x to the integer 6

  • Set pi to the float 3.14159

  • Set message to the string “Hello World!”

The statements above created three variables (x, pi, and message) and assigned to each a reference to the value on the right side of the statement. The values on the right-hand side are objects of their respective types (int, float, and str).

As the diagram shows, the variable message points to an object stored in the computer’s memory. We call this “link” a reference and normally use the verb “references”. As mentioned above, in addition to tracking an object’s state (the data an object contains), Python tracks an identifier, the object’s type, and how many variables or other objects refer to that object.
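Because variables are names attached to objects, two names can reference the same object. The sketch below illustrates this; greeting is a hypothetical second name introduced only for the example.

```python
message = "Hello World!"
greeting = message   # hypothetical second name; no copy is made

print(greeting is message)            # True: both names reference one object
print(id(greeting) == id(message))    # True

# Rebinding one name does not affect the other.
greeting = "Goodbye"
print(message)     # Hello World!
print(greeting)    # Goodbye
```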

Please realize that these statements are not equations like those in algebra: while you might expect = to represent equality, the = operator represents assignment. The interpreter first evaluates the right side of the assignment operator and then assigns the result to the variable on the left side.
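A short sketch of this evaluation order follows. The statement x = x + 1 would be nonsense as an algebraic equation, but it is routine as an assignment. The names seconds_per_minute and seconds_per_hour are made up for illustration.

```python
x = 6
x = x + 1      # the right side evaluates to 7, then x is rebound to 7
print(x)       # 7

# The right side may be any expression, including one that uses other variables.
seconds_per_minute = 60
seconds_per_hour = seconds_per_minute * 60
print(seconds_per_hour)   # 3600
```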

To get a variable’s current type, we use the built-in function type():

type(pi)
float

As an exercise, find the types of x and message by replacing pi with those variable names.

Try defining variables of the other types from the above table. Use the following cells to experiment:

# enter code to see the types of x and message
# enter code to create other variables with different types from the types table.

3.4. Typing

Unlike many other programming languages, Python uses dynamic typing. A variable has no fixed type of its own; its type is that of the object it currently references, which the interpreter determines when an assignment statement executes. As such, variables in Python can change their types as the program executes. However, as a best practice, you should use a consistent type for each variable and avoid type changes.
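The following sketch shows a single name taking on different types over time. It demonstrates what Python permits, not what is recommended; the name value is hypothetical.

```python
value = 42
print(type(value))     # <class 'int'>

value = "forty-two"    # the same name now references a str object
print(type(value))     # <class 'str'>

value = 42.0           # and now a float object
print(type(value))     # <class 'float'>
```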

3.5. Functions

In computer programming, a function is a named, reusable block of code (series of statements) that performs some action or computation. Functions can define any number of parameters to receive data from the calling code block. Functions can then return a value (result) to the calling block. Initially, we will just use Python’s built-in functions, but later we will develop our own functions to perform specific tasks.
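As a brief sketch of parameters and return values, the calls below pass data to a few of Python’s built-in functions and store what comes back. The variable names on the left are invented for the example.

```python
message = "Hello World!"

length = len(message)          # one argument; returns an int
print(length)                  # 12

largest = max(3, 7, 5)         # several arguments; returns the largest
print(largest)                 # 7

rounded = round(3.14159, 2)    # second argument: number of decimal places
print(rounded)                 # 3.14

result = print("done")         # print() displays text but returns None
print(result)                  # None
```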

You can print out the current value of a variable with the built-in function print(). Again, try printing the other variables once you execute this code block.

print(message)
Hello World!
# add some more print statements to display the values of the variables declared earlier in this notebook.

Before you run the following statement, think about how this notebook will respond.

print(msg)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[7], line 1
----> 1 print(msg)

NameError: name 'msg' is not defined

You should have received a “NameError” specifying that the variable named “msg” is not defined. One of the challenges while programming is finding and correcting mistakes such as this.

Before we use a variable on the right-hand side of an assignment or in other statements, we must first define that variable by assigning a value to it.
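The sketch below shows the same error being triggered and then avoided by defining the variable first. It uses try and except, which appear in the keyword list later in this notebook, purely so the example can continue running after the error. The name total_cost is hypothetical.

```python
try:
    print(total_cost)            # hypothetical name, not yet defined
except NameError as error:
    print("NameError:", error)   # NameError: name 'total_cost' is not defined

total_cost = 19.99               # assignment defines the variable
print(total_cost)                # 19.99
```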

3.5.1. Variable Naming Rules

As you name a variable, you should create names that represent the purpose of that variable. The variable name must also follow these rules:

  • can only contain lowercase letters (a-z), uppercase letters (A-Z), digits (0-9), and the underscore (_); Python 3 also accepts many non-ASCII letters, but ASCII names are the convention

  • cannot begin with a digit

  • cannot be one of Python’s reserved keywords.

Variable names are case-sensitive.
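One way to check candidate names against these rules is sketched below, using the string method isidentifier() and the standard library’s keyword module. The candidate names themselves are made up for illustration.

```python
import keyword

# str.isidentifier() checks the character rules
print("num_seconds_per_hour".isidentifier())   # True
print("2nd_place".isidentifier())              # False: begins with a digit
print("total-cost".isidentifier())             # False: a hyphen is not allowed

# keyword.iskeyword() checks the reserved-keyword rule
print(keyword.iskeyword("for"))                # True: cannot be a variable name
print(keyword.iskeyword("total"))              # False

# Names are case-sensitive: these are two different variables
total = 1
Total = 2
print(total, Total)                            # 1 2
```

Note that isidentifier() alone is not enough: it returns True for keywords such as for, so both checks are needed.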

By convention, variable names in Python are lowercase with words separated by underscores to improve readability:

num_seconds_per_hour = 3600

Use informative names for variables to assist yourself and others in reading your code. What if we used g = 3600 in the preceding code block? Would you be able to figure out the purpose and meaning of g? What happens if you revisit the code a year from now?

To see the list of reserved keywords, execute the following statement:

help("keywords")
Here is a list of the Python keywords.  Enter any keyword to get more help.

False               class               from                or
None                continue            global              pass
True                def                 if                  raise
and                 del                 import              return
as                  elif                in                  try
assert              else                is                  while
async               except              lambda              with
await               finally             nonlocal            yield
break               for                 not                 

This output (the list of keywords) may not yet make sense, but by the end of the course, we will cover most of these keywords.

To see more information about a particular keyword, type help("keywordName"), replacing keywordName with the keyword of interest. Then, examine a couple of other keywords.

help("for")
The "for" statement
*******************

The "for" statement is used to iterate over the elements of a sequence
(such as a string, tuple or list) or other iterable object:

   for_stmt ::= "for" target_list "in" starred_list ":" suite
                ["else" ":" suite]

The "starred_list" expression is evaluated once; it should yield an
*iterable* object.  An *iterator* is created for that iterable. The
first item provided by the iterator is then assigned to the target
list using the standard rules for assignments (see Assignment
statements), and the suite is executed.  This repeats for each item
provided by the iterator.  When the iterator is exhausted, the suite
in the "else" clause, if present, is executed, and the loop
terminates.

A "break" statement executed in the first suite terminates the loop
without executing the "else" clause’s suite.  A "continue" statement
executed in the first suite skips the rest of the suite and continues
with the next item, or with the "else" clause if there is no next
item.

The for-loop makes assignments to the variables in the target list.
This overwrites all previous assignments to those variables including
those made in the suite of the for-loop:

   for i in range(10):
       print(i)
       i = 5             # this will not affect the for-loop
                         # because i will be overwritten with the next
                         # index in the range

Names in the target list are not deleted when the loop is finished,
but if the sequence is empty, they will not have been assigned to at
all by the loop.  Hint: the built-in type "range()" represents
immutable arithmetic sequences of integers. For instance, iterating
"range(3)" successively yields 0, 1, and then 2.

Changed in version 3.11: Starred elements are now allowed in the
expression list.

Related help topics: break, continue, while

3.6. Statements

The assignments and function calls above are examples of statements: units of instruction that tell the computer to perform a specific action. Programs are just a series of these statements.

3.7. Objects Redux

To revisit the start of this notebook, Python manages all data as objects. As such, we typically have ways to inspect a particular object’s state as well as to execute methods on that object.

While the data shown in this notebook have relatively simple states (i.e., just the value of the integer, float, or string datatype), they do have behavior (additional methods) defined.

Integer values are represented as binary numbers within the computer’s memory. To see how long a particular representation is (i.e., how many zeros and ones are needed), we can call the bit_length() method on an integer. bit_length() is an example of object behavior.

x = 65535
x.bit_length()
16

A pattern such as object_name.method_name() (e.g., x.bit_length()) uses the behavior associated with that object. Programmers read this as calling method_name on object_name. Methods are the same as functions, except that they belong to a particular object.
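A few more method calls are sketched below to show the pattern on both integers and strings. The bin() built-in is included only to make the bit count visible.

```python
x = 65535
print(x.bit_length())        # 16
print(bin(x))                # 0b1111111111111111

# Calling a method on an integer literal needs parentheses,
# because "37." would otherwise be read as the start of a float.
print((37).bit_length())     # 6

# Strings provide their own methods
message = "Hello World!"
print(message.upper())       # HELLO WORLD!
print(message.count("o"))    # 2
```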

As with functions, we will use predefined object types (classes) as well as write our own.

To see the available methods on an object, we use the built-in dir() function. You can call dir() with a literal value, a type name, or a variable as the argument. For example,

dir(65535)
dir(int)
dir(x)

will all return the same output.

dir(int)
['__abs__',
 '__add__',
 '__and__',
 '__bool__',
 '__ceil__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floor__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getnewargs__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__le__',
 '__lshift__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__round__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 '__xor__',
 'as_integer_ratio',
 'bit_count',
 'bit_length',
 'conjugate',
 'denominator',
 'from_bytes',
 'imag',
 'is_integer',
 'numerator',
 'real',
 'to_bytes']
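The claim that all three dir() calls return the same list can be checked directly, and the long listing can be trimmed to the names without double underscores. This is only a sketch; the filtered list depends on your Python version, so no expected output is shown for it. The name public_names is made up for the example.

```python
x = 65535

# All three calls produce the same list of names
print(dir(65535) == dir(int) == dir(x))   # True

# Keep only the names that do not start with a double underscore
public_names = [name for name in dir(int) if not name.startswith("__")]
print(public_names)
```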

As you examine the output for dir(int), notice that many methods begin and end with double underscores. Referred to as “dunder” methods, these methods serve special purposes within Python: initializing an object (__init__), producing a string representation of an object (__str__), or implementing built-in functions and operators (abs() calls __abs__, == calls __eq__, etc.). Normally, we do not call these methods directly but rather invoke them implicitly through other operations.

For example, the call to abs() in the following code implicitly invokes __abs__() from the int class to get the absolute value.

a = -50
a = abs(a)
print(a)
50
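To see the implicit invocation more directly, the sketch below pairs several ordinary operations with the dunder method each one relies on. Calling dunder methods explicitly like this is for demonstration only; everyday code uses the left-hand forms.

```python
a = -50

print(abs(a))            # 50
print(a.__abs__())       # 50, what abs(a) invokes behind the scenes

print(a == -50)          # True
print(a.__eq__(-50))     # True, what == invokes

print(str(a))            # -50
print(a.__str__())       # -50

print(3 + 4)             # 7
print((3).__add__(4))    # 7
```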

3.8. Bringing Concepts Together

One way to help think of variables and types is to relate those concepts to buildings in a city.

Variables equate to the different buildings. Each one holds something valuable, like information or data. Just as buildings come in different types (residential, commercial, industrial), variables can hold different types of data, such as numbers or text. And as each type of building serves a specific function, different data types are designed to hold different kinds of information and provide different functionality. The value of a variable is what’s stored inside, much like the contents of a building. For example, a residential building might contain people, a commercial building might contain goods or services, and a variable might contain a number, a word, or the result of a calculation. When you assign a value to a variable, it’s like putting something inside the building: you’re giving it a purpose and filling it with something meaningful. Values are stored in memory, similar to how buildings occupy physical space in the city. The computer allocates memory to store these values so that they can be accessed and manipulated during the execution of the program. The analogy is a simplification: as described earlier, a Python variable is strictly a name attached to an object rather than a container.

3.9. Getting Help

To get help on a particular method, we can use the built-in help() function. Pass the method to help(), qualified by either the type name or an object reference.

help(int.bit_length)
Help on method_descriptor:

bit_length(self, /)
    Number of bits necessary to represent self in binary.

    >>> bin(37)
    '0b100101'
    >>> (37).bit_length()
    6
help(x.bit_length)
Help on built-in function bit_length:

bit_length() method of builtins.int instance
    Number of bits necessary to represent self in binary.

    >>> bin(37)
    '0b100101'
    >>> (37).bit_length()
    6
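The descriptive text that help() displays comes from an object’s documentation string, which can also be read directly through the __doc__ attribute. The sketch below prints two of them; the exact wording may differ between Python versions, so no expected output is shown.

```python
# The description shown by help() is stored in the __doc__ attribute
print(int.bit_length.__doc__)

# The same attribute exists on built-in functions
print(len.__doc__)
```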

We can also just get help information on the data type itself.

help(int)
Help on class int in module builtins:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |
 |  Built-in subclasses:
 |      bool
 |
 |  Methods defined here:
 |
 |  __abs__(self, /)
 |      abs(self)
 |
 |  __add__(self, value, /)
 |      Return self+value.
 |
 |  __and__(self, value, /)
 |      Return self&value.
 |
 |  __bool__(self, /)
 |      True if self else False
 |
 |  __ceil__(...)
 |      Ceiling of an Integral returns itself.
 |
 |  __divmod__(self, value, /)
 |      Return divmod(self, value).
 |
 |  __eq__(self, value, /)
 |      Return self==value.
 |
 |  __float__(self, /)
 |      float(self)
 |
 |  __floor__(...)
 |      Flooring an Integral returns itself.
 |
 |  __floordiv__(self, value, /)
 |      Return self//value.
 |
 |  __format__(self, format_spec, /)
 |      Convert to a string according to format_spec.
 |
 |  __ge__(self, value, /)
 |      Return self>=value.
 |
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |
 |  __getnewargs__(self, /)
 |
 |  __gt__(self, value, /)
 |      Return self>value.
 |
 |  __hash__(self, /)
 |      Return hash(self).
 |
 |  __index__(self, /)
 |      Return self converted to an integer, if self is suitable for use as an index into a list.
 |
 |  __int__(self, /)
 |      int(self)
 |
 |  __invert__(self, /)
 |      ~self
 |
 |  __le__(self, value, /)
 |      Return self<=value.
 |
 |  __lshift__(self, value, /)
 |      Return self<<value.
 |
 |  __lt__(self, value, /)
 |      Return self<value.
 |
 |  __mod__(self, value, /)
 |      Return self%value.
 |
 |  __mul__(self, value, /)
 |      Return self*value.
 |
 |  __ne__(self, value, /)
 |      Return self!=value.
 |
 |  __neg__(self, /)
 |      -self
 |
 |  __or__(self, value, /)
 |      Return self|value.
 |
 |  __pos__(self, /)
 |      +self
 |
 |  __pow__(self, value, mod=None, /)
 |      Return pow(self, value, mod).
 |
 |  __radd__(self, value, /)
 |      Return value+self.
 |
 |  __rand__(self, value, /)
 |      Return value&self.
 |
 |  __rdivmod__(self, value, /)
 |      Return divmod(value, self).
 |
 |  __repr__(self, /)
 |      Return repr(self).
 |
 |  __rfloordiv__(self, value, /)
 |      Return value//self.
 |
 |  __rlshift__(self, value, /)
 |      Return value<<self.
 |
 |  __rmod__(self, value, /)
 |      Return value%self.
 |
 |  __rmul__(self, value, /)
 |      Return value*self.
 |
 |  __ror__(self, value, /)
 |      Return value|self.
 |
 |  __round__(...)
 |      Rounding an Integral returns itself.
 |
 |      Rounding with an ndigits argument also returns an integer.
 |
 |  __rpow__(self, value, mod=None, /)
 |      Return pow(value, self, mod).
 |
 |  __rrshift__(self, value, /)
 |      Return value>>self.
 |
 |  __rshift__(self, value, /)
 |      Return self>>value.
 |
 |  __rsub__(self, value, /)
 |      Return value-self.
 |
 |  __rtruediv__(self, value, /)
 |      Return value/self.
 |
 |  __rxor__(self, value, /)
 |      Return value^self.
 |
 |  __sizeof__(self, /)
 |      Returns size in memory, in bytes.
 |
 |  __sub__(self, value, /)
 |      Return self-value.
 |
 |  __truediv__(self, value, /)
 |      Return self/value.
 |
 |  __trunc__(...)
 |      Truncating an Integral returns itself.
 |
 |  __xor__(self, value, /)
 |      Return self^value.
 |
 |  as_integer_ratio(self, /)
 |      Return a pair of integers, whose ratio is equal to the original int.
 |
 |      The ratio is in lowest terms and has a positive denominator.
 |
 |      >>> (10).as_integer_ratio()
 |      (10, 1)
 |      >>> (-10).as_integer_ratio()
 |      (-10, 1)
 |      >>> (0).as_integer_ratio()
 |      (0, 1)
 |
 |  bit_count(self, /)
 |      Number of ones in the binary representation of the absolute value of self.
 |
 |      Also known as the population count.
 |
 |      >>> bin(13)
 |      '0b1101'
 |      >>> (13).bit_count()
 |      3
 |
 |  bit_length(self, /)
 |      Number of bits necessary to represent self in binary.
 |
 |      >>> bin(37)
 |      '0b100101'
 |      >>> (37).bit_length()
 |      6
 |
 |  conjugate(...)
 |      Returns self, the complex conjugate of any int.
 |
 |  is_integer(self, /)
 |      Returns True. Exists for duck type compatibility with float.is_integer.
 |
 |  to_bytes(self, /, length=1, byteorder='big', *, signed=False)
 |      Return an array of bytes representing an integer.
 |
 |      length
 |        Length of bytes object to use.  An OverflowError is raised if the
 |        integer is not representable with the given number of bytes.  Default
 |        is length 1.
 |      byteorder
 |        The byte order used to represent the integer.  If byteorder is 'big',
 |        the most significant byte is at the beginning of the byte array.  If
 |        byteorder is 'little', the most significant byte is at the end of the
 |        byte array.  To request the native byte order of the host system, use
 |        `sys.byteorder' as the byte order value.  Default is to use 'big'.
 |      signed
 |        Determines whether two's complement is used to represent the integer.
 |        If signed is False and a negative integer is given, an OverflowError
 |        is raised.
 |
 |  ----------------------------------------------------------------------
 |  Class methods defined here:
 |
 |  from_bytes(bytes, byteorder='big', *, signed=False) from builtins.type
 |      Return the integer represented by the given array of bytes.
 |
 |      bytes
 |        Holds the array of bytes to convert.  The argument must either
 |        support the buffer protocol or be an iterable object producing bytes.
 |        Bytes and bytearray are examples of built-in objects that support the
 |        buffer protocol.
 |      byteorder
 |        The byte order used to represent the integer.  If byteorder is 'big',
 |        the most significant byte is at the beginning of the byte array.  If
 |        byteorder is 'little', the most significant byte is at the end of the
 |        byte array.  To request the native byte order of the host system, use
 |        `sys.byteorder' as the byte order value.  Default is to use 'big'.
 |      signed
 |        Indicates whether two's complement is used to represent the integer.
 |
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  denominator
 |      the denominator of a rational number in lowest terms
 |
 |  imag
 |      the imaginary part of a complex number
 |
 |  numerator
 |      the numerator of a rational number in lowest terms
 |
 |  real
 |      the real part of a complex number

3.10. Jupyter Notes

Within the Jupyter environment, we can also access the help documentation with a ? after an item:

int?

You can also see the relevant source code of the object by typing ?? after an item. If the source code does not appear, the function or object is not directly implemented in Python but instead in C or C++.

int??

In Jupyter, as you type an object or function name, you can press the Tab key to autocomplete the current term or see a pop-up list of possible choices. After each of the items in the following code block, press the Tab key.

a.bit_l
a.
  Cell In[19], line 2
    a.
      ^
SyntaxError: invalid syntax

One final note concerning Jupyter. We have used a mixture of either placing a variable by itself on the last line of a code block or printing that value. If the last line of a code block is an expression that produces a value, Jupyter automatically displays that value. Simply typing a variable name by itself produces the value of that variable.

a
50

We have also explicitly printed the value by calling the print() function.

print(a)
50

While the output has been the same in these two examples, that does not always occur.
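Strings are one case where the two differ. For built-in types, what Jupyter shows for a bare variable on the last line generally matches the result of the built-in repr() function, so the sketch below uses print(repr(...)) to imitate that display in any Python environment. Treat this as an approximation of Jupyter’s behavior rather than a full description of it.

```python
message = "Hello World!"

print(message)          # Hello World!
print(repr(message))    # 'Hello World!'  (quotes included, as a bare message line would show)

pi = 3.14159
print(pi)               # 3.14159
print(repr(pi))         # 3.14159  (numbers look the same either way)
```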

If we just have a value by itself on a line and then another statement, Jupyter will not display the value. The following code block does not show any value.

a
x = 10

3.11. Suggested LLM Prompts

  • Explain objects in Python programming.

  • Explain types in Python.

  • How do types and variables differ in Python?

  • How do types and variables interact in Python?

  • List 5 guidelines for good variable names in Python.

  • What data type is used to store sentences in Python?

  • What is the significance of Python’s keywords? Why cannot these words be used as variable names?

3.12. Review Questions

  1. For each of the following literal values, what is the corresponding type?

    1. 1000
    2. -42
    3. 4.2
    4. 'a'
    5. "alpha"
  2. Explain the naming rules for variables.

  3. What guidelines should you follow when naming variables?

  4. What do all objects contain?

  5. What built-in function displays the value of variables or literals to the screen (console)?

  6. Within the Python interpreter, how can we get more information about a particular keyword?

  7. In Python, how does a variable determine its type? Can this type ever change?

  8. Are variable names case-sensitive?


3.13. Drill

In a terminal window, open a Python interpreter shell (i.e., execute python3 or python). Then perform the following operations. (Use a literal of the appropriate type based upon the context of the variable name.)

  1. Create a variable named pi and assign a value to it.

  2. Create a variable named gpa and assign a value to it.

  3. Create a variable named num_fingers and assign a value to it.

  4. Create a variable named is_open and assign a value to it.

  5. Create a variable named country and assign a value to it.

  6. Create a variable named year and assign a value to it.

  7. Create a variable named first_name and assign a value to it.

  8. For each of the variables created, print the value to the console.

  9. For each of the variables created, verify the type with the built-in function type().