Vectors

7. Vectors#

One of the advantages Python offers is the many built-in data structures - lists, dictionaries, and many others.C++ offers the Standard Template Library (STL). While the class explores the STL library more in-depth later, we’ll first start with the vector class that provides similar functionality to Python’s list, but with the requirement that only objects of a single type can be stored within it. Behind the scenes, vector is a dynamic array implementation that stores elements sequentially. Arrays are contiguous sequences of memory that hold objects of a common type. Unlike regular arrays, vectors can grow and shrink their memory footprint and track their actual size.

The program below demonstrates using the vector class. In line 11, we declare (and initialize) an empty vector items that holds values of type string. The < > syntax identifies a template in C++. We can use different types to create vectors that hold different types such as double and int. While we learn how to create template types later, just realize that you can use any type there.

//filename: vector.cpp
//complile: g++ -std=c++17 -o vector vector.cpp
//execute: ./vector
#include <iostream>
#include <vector>              // Include the vector header
#include <string>
#include <algorithm>           // Will use to sort the list

using std::cin, std::cout;  // bring in these names from the std namespace
using std::string, std::vector;        

int main(int argc, char *argv[]) {
    string item;
    vector<string> items;    // creates an empty vector;

    cout << "Enter a series of strings to add to the vector, ctrl-d to stop: ";
    while (cin >> item) {
        items.push_back(item);
    }

    cout << "Number of items: " << items.size() << "\n";

    std::sort(items.begin(), items.end());         // begin() and end() are iterators - see below

    cout << "Iterate over the list with a for-each loop :" <<  "\n";
    for (string s: items) {
        cout << s <<  "\n";
    }

    return EXIT_SUCCESS;
}

7.1. Declare and Initialize#

C++ provides several different ways to declare and initialize vectors:

//filename: declare.cpp
//complile: g++ -std=c++17 -o declare declare.cpp
//execute: ./declare
#include <vector> 
using std::vector; 

int main(int argc, char *argv[]) {
    int n = 4;
    vector<int> vec;                         // empty vector holding objects of type int.
    vector<int> five{1, 2, 3, 4, 5};         // vector contains as many elements as there are 
                                             // initializers (values with { } on the right-hand side)
    vector<int> alt_five = {1, 2, 3, 4, 5};  // equivalent to the above declaration
    vector<int> another(alt_five);           // another has a copy of each element of alt_five
    vector<int> v1(n);                       // vector has n copies of a value-initialized object.
                                             // (would be the default constructor or zero for numbers)
    vector<int> v2(n,5);                     // holds n copies of the value 5.
}

While it may be tempting to immediately create a vector of a specific size if you know that size, you need to be wary that behind the scenes, C++ constructs objects and places them into the vector. Instead, you should declare the vector, and then call reserve with the appropriate size. The vector library, though, is very efficient at growing. The following code demonstrates creating a vector filled with random numbers and shows how the capacity doubles as additional memory space is needed.

//filename: random.cpp
//complile: g++ -std=c++17 -o random random.cpp
//execute: ./random
#include <random>
#include <iostream>
#include <vector>    
#include <string>
using std::cout, std::vector;  

int main() {
    std::random_device rd;  // Random engine
    std::mt19937 eng(rd());   // Mersenne Twister engine seeded with rd().  Can fix the seed with an int
    
    // Create an object that produces random numbers from the Uniform distribution between 1 and 1000
    std::uniform_int_distribution<> distr(1, 1000);  

    int random_integer = distr(eng);
    cout << "Random number: " << random_integer <<  "\n";

    vector<int> items; 
    int capacity_increases = 0;
    int last_capacity = 0; 
    for (int i=1; i < 1000; i++) {
        items.push_back( distr(eng));  // add a random number to the vector
        if (items.capacity() != last_capacity) { // check if the vector capacity increases
            capacity_increases++;
            last_capacity = items.capacity();
            cout << i << " " << last_capacity <<  "\n";
        }
    }
    cout << "Total number of increases: " << capacity_increases <<  "\n";

    return 0;
}

7.2. Accessing Elements#

Use either the [] operator or the at() method to access elements.

The [] operator is slightly faster as it does not perform bounds checking. However, this could lead to undefined behavior when accessing memory not appropriately initialized. This method is preferred when the boundary sizes are known (which should be almost always). The at() method performs bounds checking and will throw a std::out_of_range exception if you attempt to access an element at an invalid index. This allows you to handle the error gracefully using a try-catch block.

C++ also provides special methods front() and back() to access the ends of the vector. We could also explicitly use 0 and size()-1.

//filename: access.cpp
//complile: g++ -std=c++17 -o access access.cpp
//execute: ./access
#include <vector> 
#include <iostream>
using std::vector; 

int main(int argc, char *argv[]) {
    vector<int> vec{1, 2, 3, 4, 5}; 

    // note the use of size_t as the type for i: https://en.cppreference.com/w/cpp/types/size_t
    for (size_t i=0; i < vec.size(); i++) {
        std::cout << vec[i] << " - " << vec.at(i) <<  "\n";
    } 

    //DANGER - demonstrating how the [] does not perform bounds checking.
    std::cout << vec[10] << std::endl;

    // Using at() - With bounds checking
    try {
        std::cout << vec.at(7) << std::endl;    // throws an exception
    } catch(const std::out_of_range& e) {
        std::cout << "Out of Range error: " << e.what() <<  "\n";
    }

    std::cout << "First element: " << vec.front() << " -- " << vec[0] <<  "\n";
    std::cout << "Last element: " << vec.back() << " -- " << vec[vec.size()-1]  <<  "\n";
}

7.3. Adding and Removing Elements#

C++ provides several methods to add and remove elements. We’ve already demonstrated push_back() which adds an element to the end of the list. pop_back() removes an element from the end of a list, but does not return a value. __insert(_pos_, _element_)__inserts an element at the specified position. Rather than requiring an integral value, we’ll need to use an iterator value - which we can access with either begin() or end(). Then we can use arithmetic to get to the correct position. erase(_pos_) removes an element at the specified position. You can also specify a range of elements to remove. We’ll need to provide two iterators to erase() function: the iterator pointing to the first element to be removed and the iterator pointing to one past the last element to be removed - that is, a range of elements [first, last). clear() removes all elements from the vector.

//filename: insert.cpp
//complile: g++ -std=c++17 -o insert insert.cpp
//execute: ./insert
#include <vector> 
#include <iostream>
using std::vector; 

// Defining a templated function so that we can print vectors of any type
template<typename T>
void printVector(const std::vector<T>& vec) {
    std::cout << "{ ";
    bool first = true;
    for (const auto& elem : vec) {   // This is a fencepost loop.  Special case the first
        if (first) {
            std::cout << elem;
            first = false;
        }
        else {
            std::cout << ", " << elem;
        }
    }
    std::cout << " }" <<  "\n";
}

int main(int argc, char *argv[]) {
    vector<int> vec{1, 2,  4, 5}; 

    vec.insert(vec.begin() +2 ,3);
    vec.insert(vec.end(),8);
    vec.insert(vec.end()-1,7);   // what happens if you try vec.end() + 1 
    printVector(vec);

    // Insert multiple copies of an element
    vec.insert(vec.begin(), 2, 0); // Insert two '0's at the start
    printVector(vec);

    // remove the first element
    vec.erase(vec.begin());
    printVector(vec);

    // remove the last two elements
    vec.erase(vec.end()-2,vec.end());
    printVector(vec);

    vec.clear();
    printVector(vec);
}

7.4. Size and Capacity#

Unlike using native arrays directly, we can obtain the current size (number of elements currently in the vector) and the capacity (the number of elements the vector can store without having to allocate more memory).

//filename: capacity.cpp
//complile: g++ -std=c++17 -o capacity capacity.cpp
//execute: ./capacity
#include <vector> 
#include <iostream>
using std::vector; 


int main(int argc, char *argv[]) {
    vector<int> vec{1, 2, 3};

    std::cout << "Initial size: " << vec.size() <<  "\n";
    std::cout << "Initial capacity: " << vec.capacity() <<  "\n";

    vec.push_back(4);  // note the doubling of the capacity
    std::cout << "Size: " << vec.size() <<  "\n";
    std::cout << "Capacity: " << vec.capacity() <<  "\n";
} 

While not shown in the above code, both size() and capacity() generally return an unsigned long as the result type. (Technically, the exact type is vector<int>::size_type.) While in many cases you can use an int, depending upon compiler options, this may cause a warning and/or error. Generally, you should use size_t or vector<elementType>::size_type as the type. This page demonstrates this in the first loop under “Accessing Elements”. https://en.cppreference.com/w/cpp/types/size_t

    // note the use of size_t as the type for i
    for (size_t i=0; i < vec.size(); i++) {
        std::cout << vec[i] << " - " << vec.at(i) <<  "\n";
    } 

7.5. Iterators#

This page has already demonstrated iterating through a vector using a for-each loop as well as a standard for loop with indexes. We presented iterator variables as we added and removed items with insert() and erase(). These iterators give us indirect access to an object - an element in a container such as a vector. We can use an iterator to fetch an element by using the dereference operator, *, from the specified location in the iterator variable. Unsurprisingly, we can use these variables to move through a vector. We can use arithmetic to move forward and backward in the vector with these iterator variables. Use begin() and end() to perform necessary bounds checking. A valid iterator signifies an element or a position just beyond the container’s last element. All other iterator values are invalid. Note the use of the auto keyword to define the iterator type. Except in extremely rare circumstances, knowing the exact type is unnecessary.

//filename: iterate.cpp
//complile: g++ -std=c++17 -o iterate iterate.cpp
//execute: ./iterate
#include <vector> 
#include <iostream>
using std::vector; 


int main(int argc, char *argv[]) {
    vector<int> vec{1, 2, 3};

    for (auto it = vec.begin(); it != vec.end(); it++) {
        std::cout << *it << ' ';
    }
    for (auto it = vec.end(); it >= vec.begin(); it--) { // note the comparison
        std::cout << *it << ' ';
    }
    std::cout << std::endl;

}

7.6. Memory Management#

One of the great things about using vector is not having to worry about memory. The class makes sensible decisions for us. However, based on specialized tasks (such as declaring a large vector that we know we’ll use or freeing up space from a large vector that’s no longer being used), we can call specific functions to reserve memory or to free memory.

resize(n) changes the size of the vector to n.
reserve(n) reserves at least n elements worth of memory.
shrink_to_fit() reduces memory usage by freeing unused space.

//filename: memory.cpp
//complile: g++ -std=c++17 -o memory memory.cpp
//execute: ./memory
#include <vector> 
#include <iostream>
using std::vector; 


int main(int argc, char *argv[]) {
    vector<int> vec{1, 2, 3};

    std::cout << "1 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";

    vec.reserve(10);  // increase capacity to 10
    std::cout << "2 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";

    vec.push_back(4);
    vec.shrink_to_fit();
    std::cout << "3 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";

    vec.resize(2); // truncates to just the first two elements
    std::cout << "4 - size: " << vec.size() << ", capacity: " << vec.capacity() <<  "\n";

}

7.7. Combining functions, strings, and vectors#

Now that we have covered many of the basics in C++, we can start to develop useful, reusable functions. Below, we take the split code from the string docable and then encapsulate it into a function. Within main, we’ve written a small series of test cases to validate our function. You should comment lines 23 and 25 (so the last string is always added to the result) to see how things change.

//filename: parseLine.cpp
//complile: g++ -std=c++17 -o parseLine parseLine.cpp
//execute: ./parseLine
#include <iostream>
#include <string>
#include <vector>

/**
 * @brief Split apart a line based upon delimiters in the string
 * 
 * @param line The string to be split
 * @param delim a string specifying the delimeter used to split apart the string
 * @return a vector of string objects
 */
std::vector<std::string> splitLine(std::string line, std::string delim) {
    std::vector<std::string> result;

    auto start = 0U;
    auto end = line.find(delim);
    while (end != std::string::npos) {
        result.push_back(line.substr(start, end - start));
        start = end + delim.length();
        end = line.find(delim, start);
    }
    std::string last = line.substr(start, end);
    if (last.length()) {
        result.push_back(last);
    }

    return result;
}

int main(int argc, char *argv[]) {
    std::vector<std::string> samples = {"", "java:c:c++:Python:Pascal:Basic", "C++", "C++:Java:"};
    std::vector<int> expected_sizes = {0,6,1,2};
    std::string delim = ":";

    for (size_t i = 0U; i< samples.size(); i++) {
        std::string line = samples[i];
        std::cout << "Testing: " << line << std::endl;
        std::vector<std::string> splits = splitLine(line,delim);
        for (std::string item: splits) {
            std::cout << item <<  "\n";
        }
        if (splits.size() == expected_sizes[i]) {
            std::cout << "result size matched" <<  "\n";
        }
        else {
            std::cout << "ERROR - result size unexpected, expected: " << expected_sizes[i] <<  "\n";
        }
        std::cout << "Finished: " << line <<  "\n";E
    }
    
}