7. Vectors#

One of the advantages Python offers is the many built-in data structures - lists, dictionaries, and many others.C++ offers the Standard Template Library (STL). While the class explores the STL library more in-depth later, we’ll first start with the vector class that provides similar functionality to Python’s list, but with the requirement that only objects of a single type can be stored within it. Behind the scenes, vector is a dynamic array implementation that stores elements sequentially. Arrays are contiguous sequences of memory that hold objects of a common type. Unlike regular arrays, vectors can grow and shrink their memory footprint and track their actual size.

The program below demonstrates using the vector class. In line 11, we declare (and initialize) an empty vector items that holds values of type string. The < > syntax identifies a template in C++. We can use different types to create vectors that hold different types such as double and int. While we learn how to create template types later, just realize that you can use any type there.

 1//filename: vector.cpp
 2//complile: g++ -std=c++17 -o vector vector.cpp
 3//execute: ./vector
 4#include <iostream>
 5#include <vector>              // Include the vector header
 6#include <string>
 7#include <algorithm>           // Will use to sort the list
 8
 9using std::cin, std::cout;  // bring in these names from the std namespace
10using std::string, std::vector;        
11
12int main(int argc, char *argv[]) {
13    string item;
14    vector<string> items;    // creates an empty vector;
15
16    cout << "Enter a series of strings to add to the vector, ctrl-d to stop: ";
17    while (cin >> item) {
18        items.push_back(item);
19    }
20
21    cout << "Number of items: " << items.size() << "\n";
22
23    std::sort(items.begin(), items.end());         // begin() and end() are iterators - see below
24
25    cout << "Iterate over the list with a for-each loop :" <<  "\n";
26    for (string s: items) {
27        cout << s <<  "\n";
28    }
29
30    return EXIT_SUCCESS;
31}

7.1. Declare and Initialize#

C++ provides several different ways to declare and initialize vectors:

 1//filename: declare.cpp
 2//complile: g++ -std=c++17 -o declare declare.cpp
 3//execute: ./declare
 4#include <vector> 
 5using std::vector; 
 6
 7int main(int argc, char *argv[]) {
 8    int n = 4;
 9    vector<int> vec;                         // empty vector holding objects of type int.
10    vector<int> five{1, 2, 3, 4, 5};         // vector contains as many elements as there are 
11                                             // initializers (values with { } on the right-hand side)
12    vector<int> alt_five = {1, 2, 3, 4, 5};  // equivalent to the above declaration
13    vector<int> another(alt_five);           // another has a copy of each element of alt_five
14    vector<int> v1(n);                       // vector has n copies of a value-initialized object.
15                                             // (would be the default constructor or zero for numbers)
16    vector<int> v2(n,5);                     // holds n copies of the value 5.
17}

While it may be tempting to immediately create a vector of a specific size if you know that size, you need to be wary that behind the scenes, C++ constructs objects and places them into the vector. Instead, you should declare the vector, and then call reserve with the appropriate size. The vector library, though, is very efficient at growing. The following code demonstrates creating a vector filled with random numbers and shows how the capacity doubles as additional memory space is needed.

 1//filename: random.cpp
 2//complile: g++ -std=c++17 -o random random.cpp
 3//execute: ./random
 4#include <random>
 5#include <iostream>
 6#include <vector>    
 7#include <string>
 8using std::cout, std::vector;  
 9
10int main() {
11    std::random_device rd;  // Random engine
12    std::mt19937 eng(rd());   // Mersenne Twister engine seeded with rd().  Can fix the seed with an int
13    
14    // Create an object that produces random numbers from the Uniform distribution between 1 and 1000
15    std::uniform_int_distribution<> distr(1, 1000);  
16
17    int random_integer = distr(eng);
18    cout << "Random number: " << random_integer <<  "\n";
19
20    vector<int> items; 
21    int capacity_increases = 0;
22    int last_capacity = 0; 
23    for (int i=1; i < 1000; i++) {
24        items.push_back( distr(eng));  // add a random number to the vector
25        if (items.capacity() != last_capacity) { // check if the vector capacity increases
26            capacity_increases++;
27            last_capacity = items.capacity();
28            cout << i << " " << last_capacity <<  "\n";
29        }
30    }
31    cout << "Total number of increases: " << capacity_increases <<  "\n";
32
33    return 0;
34}

7.2. Accessing Elements#

Use either the [] operator or the at() method to access elements.

The [] operator is slightly faster as it does not perform bounds checking. However, this could lead to undefined behavior when accessing memory not appropriately initialized. This method is preferred when the boundary sizes are known (which should be almost always). The at() method performs bounds checking and will throw a std::out_of_range exception if you attempt to access an element at an invalid index. This allows you to handle the error gracefully using a try-catch block.

C++ also provides special methods front() and back() to access the ends of the vector. We could also explicitly use 0 and size()-1.

 1//filename: access.cpp
 2//complile: g++ -std=c++17 -o access access.cpp
 3//execute: ./access
 4#include <vector> 
 5#include <iostream>
 6using std::vector; 
 7
 8int main(int argc, char *argv[]) {
 9    vector<int> vec{1, 2, 3, 4, 5}; 
10
11    // note the use of size_t as the type for i: https://en.cppreference.com/w/cpp/types/size_t
12    for (size_t i=0; i < vec.size(); i++) {
13        std::cout << vec[i] << " - " << vec.at(i) <<  "\n";
14    } 
15
16    //DANGER - demonstrating how the [] does not perform bounds checking.
17    std::cout << vec[10] << std::endl;
18
19    // Using at() - With bounds checking
20    try {
21        std::cout << vec.at(7) << std::endl;    // throws an exception
22    } catch(const std::out_of_range& e) {
23        std::cout << "Out of Range error: " << e.what() <<  "\n";
24    }
25
26    std::cout << "First element: " << vec.front() << " -- " << vec[0] <<  "\n";
27    std::cout << "Last element: " << vec.back() << " -- " << vec[vec.size()-1]  <<  "\n";
28}

7.3. Adding and Removing Elements#

C++ provides several methods to add and remove elements. We’ve already demonstrated push_back() which adds an element to the end of the list. pop_back() removes an element from the end of a list, but does not return a value. __insert(_pos_, _element_)__inserts an element at the specified position. Rather than requiring an integral value, we’ll need to use an iterator value - which we can access with either begin() or end(). Then we can use arithmetic to get to the correct position. erase(_pos_) removes an element at the specified position. You can also specify a range of elements to remove. We’ll need to provide two iterators to erase() function: the iterator pointing to the first element to be removed and the iterator pointing to one past the last element to be removed - that is, a range of elements [first, last). clear() removes all elements from the vector.

 1//filename: insert.cpp
 2//complile: g++ -std=c++17 -o insert insert.cpp
 3//execute: ./insert
 4#include <vector> 
 5#include <iostream>
 6using std::vector; 
 7
 8// Defining a templated function so that we can print vectors of any type
 9template<typename T>
10void printVector(const std::vector<T>& vec) {
11    std::cout << "{ ";
12    bool first = true;
13    for (const auto& elem : vec) {   // This is a fencepost loop.  Special case the first
14        if (first) {
15            std::cout << elem;
16            first = false;
17        }
18        else {
19            std::cout << ", " << elem;
20        }
21    }
22    std::cout << " }" <<  "\n";
23}
24
25int main(int argc, char *argv[]) {
26    vector<int> vec{1, 2,  4, 5}; 
27
28    vec.insert(vec.begin() +2 ,3);
29    vec.insert(vec.end(),8);
30    vec.insert(vec.end()-1,7);   // what happens if you try vec.end() + 1 
31    printVector(vec);
32
33    // Insert multiple copies of an element
34    vec.insert(vec.begin(), 2, 0); // Insert two '0's at the start
35    printVector(vec);
36
37    // remove the first element
38    vec.erase(vec.begin());
39    printVector(vec);
40
41    // remove the last two elements
42    vec.erase(vec.end()-2,vec.end());
43    printVector(vec);
44
45    vec.clear();
46    printVector(vec);
47}

7.4. Size and Capacity#

Unlike using native arrays directly, we can obtain the current size (number of elements currently in the vector) and the capacity (the number of elements the vector can store without having to allocate more memory).

 1//filename: capacity.cpp
 2//complile: g++ -std=c++17 -o capacity capacity.cpp
 3//execute: ./capacity
 4#include <vector> 
 5#include <iostream>
 6using std::vector; 
 7
 8
 9int main(int argc, char *argv[]) {
10    vector<int> vec{1, 2, 3};
11
12    std::cout << "Initial size: " << vec.size() <<  "\n";
13    std::cout << "Initial capacity: " << vec.capacity() <<  "\n";
14
15    vec.push_back(4);  // note the doubling of the capacity
16    std::cout << "Size: " << vec.size() <<  "\n";
17    std::cout << "Capacity: " << vec.capacity() <<  "\n";
18} 

While not shown in the above code, both size() and capacity() generally return an unsigned long as the result type. (Technically, the exact type is vector<int>::size_type.) While in many cases you can use an int, depending upon compiler options, this may cause a warning and/or error. Generally, you should use size_t or vector<elementType>::size_type as the type. This page demonstrates this in the first loop under “Accessing Elements”. https://en.cppreference.com/w/cpp/types/size_t

    // note the use of size_t as the type for i
    for (size_t i=0; i < vec.size(); i++) {
        std::cout << vec[i] << " - " << vec.at(i) <<  "\n";
    } 

7.5. Iterators#

This page has already demonstrated iterating through a vector using a for-each loop as well as a standard for loop with indexes. We presented iterator variables as we added and removed items with insert() and erase(). These iterators give us indirect access to an object - an element in a container such as a vector. We can use an iterator to fetch an element by using the dereference operator, *, from the specified location in the iterator variable. Unsurprisingly, we can use these variables to move through a vector. We can use arithmetic to move forward and backward in the vector with these iterator variables. Use begin() and end() to perform necessary bounds checking. A valid iterator signifies an element or a position just beyond the container’s last element. All other iterator values are invalid. Note the use of the auto keyword to define the iterator type. Except in extremely rare circumstances, knowing the exact type is unnecessary.

 1//filename: iterate.cpp
 2//complile: g++ -std=c++17 -o iterate iterate.cpp
 3//execute: ./iterate
 4#include <vector> 
 5#include <iostream>
 6using std::vector; 
 7
 8
 9int main(int argc, char *argv[]) {
10    vector<int> vec{1, 2, 3};
11
12    for (auto it = vec.begin(); it != vec.end(); it++) {
13        std::cout << *it << ' ';
14    }
15    for (auto it = vec.end(); it >= vec.begin(); it--) { // note the comparison
16        std::cout << *it << ' ';
17    }
18    std::cout << std::endl;
19
20}

7.6. Memory Management#

One of the great things about using vector is not having to worry about memory. The class makes sensible decisions for us. However, based on specialized tasks (such as declaring a large vector that we know we’ll use or freeing up space from a large vector that’s no longer being used), we can call specific functions to reserve memory or to free memory.

  • resize(n) changes the size of the vector to n.

  • reserve(n) reserves at least n elements worth of memory.

  • shrink_to_fit() reduces memory usage by freeing unused space.

 1//filename: memory.cpp
 2//complile: g++ -std=c++17 -o memory memory.cpp
 3//execute: ./memory
 4#include <vector> 
 5#include <iostream>
 6using std::vector; 
 7
 8
 9int main(int argc, char *argv[]) {
10    vector<int> vec{1, 2, 3};
11
12    std::cout << "1 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";
13
14    vec.reserve(10);  // increase capacity to 10
15    std::cout << "2 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";
16
17    vec.push_back(4);
18    vec.shrink_to_fit();
19    std::cout << "3 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";
20
21    vec.resize(2); // truncates to just the first two elements
22    std::cout << "4 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";
23
24}

7.7. Combining functions, strings, and vectors#

Now that we have covered many of the basics in C++, we can start to develop useful, reusable functions. Below, we take the split code from the string docable and then encapsulate it into a function. Within main, we’ve written a small series of test cases to validate our function. You should comment lines 23 and 25 (so the last string is always added to the result) to see how things change.

 1//filename: parseLine.cpp
 2//complile: g++ -std=c++17 -o parseLine parseLine.cpp
 3//execute: ./parseLine
 4#include <iostream>
 5#include <string>
 6#include <vector>
 7
 8/**
 9 * @brief Split apart a line based upon delimiters in the string
10 * 
11 * @param line The string to be split
12 * @param delim a string specifying the delimeter used to split apart the string
13 * @return a vector of string objects
14 */
15std::vector<std::string> splitLine(std::string line, std::string delim) {
16    std::vector<std::string> result;
17
18    auto start = 0U;
19    auto end = line.find(delim);
20    while (end != std::string::npos) {
21        result.push_back(line.substr(start, end - start));
22        start = end + delim.length();
23        end = line.find(delim, start);
24    }
25    std::string last = line.substr(start, end);
26    if (last.length()) {
27        result.push_back(last);
28    }
29
30    return result;
31}
32
33int main(int argc, char *argv[]) {
34    std::vector<std::string> samples = {"", "java:c:c++:Python:Pascal:Basic", "C++", "C++:Java:"};
35    std::vector<int> expected_sizes = {0,6,1,2};
36    std::string delim = ":";
37
38    for (size_t i = 0U; i< samples.size(); i++) {
39        std::string line = samples[i];
40        std::cout << "Testing: " << line << std::endl;
41        std::vector<std::string> splits = splitLine(line,delim);
42        for (std::string item: splits) {
43            std::cout << item <<  "\n";
44        }
45        if (splits.size() == expected_sizes[i]) {
46            std::cout << "result size matched" <<  "\n";
47        }
48        else {
49            std::cout << "ERROR - result size unexpected, expected: " << expected_sizes[i] <<  "\n";
50        }
51        std::cout << "Finished: " << line <<  "\n";E
52    }
53    
54}