Data Structure: Arrays and Lists (I)

Programming with C++

YaLinChen (Amber)
6 min readMar 1, 2022
Source: Unsplash

Arrays and lists are the most common and fundamental data structures. These two kinds of structures are linear data structures.

Other data structures include trees, tables, graphs, etc.

Simply put, an array is “contiguous”, meaning that it stores all the data elements together. On the other hand, a list is “linked” since it does not store all the elements together, rather, it stores the elements separately with “pointers” that associate them. Furthermore, a list can work forward or bidirectionally.

An illustration of an array and a signly linked list. The blue boxes represent the data elements and the purple boxes represent the corresponding memory locations. In the array, the locations are continuous, whereas in the list, the locations are not continuous and can be totally random.

All these different types of features can be applied to corresponding problems, and choosing the right structure for your problem is so fundamental to your task. A useful metric to determine the appropriateness of a data structures is algorithmic complexity, also called time complexity.

Time complexity indicates the relative amount of time required, in proportion to the size of the data, to perform a certain operation.” Intuitively, the less the time it takes to manipulate the same amount of data is preferred.”

Big O notation of time complexity

Often we see the notation O(x) to represent the time complexity of a data structure or algorithm. This is a notation for users to know to execute a task, how much time the algorithm should take for an amount of data.

For example, an O(1) means the time remains regardless the data volume; access to an element in an array is O(1) (see below). O(n) means the time is proportional to the data volume; insertion of an element to an array is O(n) (see below)

There are more time complexity other than O(1) and O(n). Some are efficient and some are not. It depends on what performance metric of interest and the characteristics of the data/problem you have.

Array

An array is a storage of data elements in contiguous memory locations. Together, these elements are stored in a big chunk of memory with corresponding allocated memory for each. In other words, we can simply use a linear formula Location of Data[n]= Location of Data[0]+size of each element*n.

An illustration of array memory

Time complexity of an array

  1. Access: The above formula for calculating the location of each element enables us to access a specific element with one step regardless of the number of elements in an array. The time complexity for this task is O(1).
  2. Insertion/deletion: No matter where you want to insert or delete an element, you can only do it “after”. Since all elements are stored in a contiguous space of memory, after the insertion/deletion, we will need to update all locations of the elements behind the data point. The time complexity for this task is O(n).
Inserting an element to an array. The blue boxes represent the data elements and the purple boxes represent the corresponding memory locations. Here we want to insert Data(new) after Data[1]. After the insertion, the memory locations of the elements after Data(new) has to be changed (the orange blocks), and hence this is an O(n) operation.

3. Overwriting an element: overwriting an element needs only change the data of the element without changing the location. The time complexity for this task is O(1).

Static v.s. Dynamic Arrays

A static array is a data structure whose memory is fixed size and determined before the runtime; in other words, once you run the program, there is no way to change the size, such as appending or deleting an element. On the other hand, a dynamic array allows you to change the size during the runtime. This comes in handy, for example, a hospital wants to keep a list of the patient IDs while they do not know in advance how many patients they will receive and this number is constantly changing.

Programming with C++

Static arrays

https://en.cppreference.com/w/cpp/container/array

// static arrays
// for static arrays, we use std::array container
#include <iostream>
#include <array>
using namespace std;
int main()
{
//1.
array<int,10> arr; // declare an array with a size of 10 and elements of integers
arr[0] = 1 // assign integer 1 to the first element (index 0)
cout << "First element: " << arr[0] << endl;
//2.
array<int,4> arr2 = {1, 2, 3, 4}; // declare a second array
cout << "Elements in second array: ";
int arrSize = sizeof(arr2) / sizeof(arr2[0]); //sizeof: to divide the size of the array by the size of each element (in bytes).for (int i = 0; i < arrSize; i++)
{
cout << arr2[i] << " ";
cout << " " << endl;
}
arr2[2] = 10; // overwirte an element
cout << arr2[2] << endl;
//3. sort an array
std::sort(arr2.begin(), arr2.end());
for (int i = 0; i < arrSize; i++)
{
cout << arr2[i] << " ";
}
//4. reverse an array
std::reverse_copy(arr2.begin(), arr2.end(),std::ostream_iterator<int>(std::cout, " "));

Dynamic arrays

https://en.cppreference.com/w/cpp/container/vector

// dynamic arrays
// for dynamic arrays, we use std::vector container
#include <iostream>
#include <vector>
using namespace std;
int main()
{
//1. Knowing your data
vector<int> v1 = {1, 2, 3, 4, 5};
cout << v1[0] << endl;
// the first element (an index method)
cout << v1.front() << endl;
// the first element (iterator method)
cout << v1.back() << endl;
// the last element
cout << v1.size() << endl;
// the number of elements the vector currently holds
cout << v1.capacity() << endl;
// the number of elements the vector can hold
//2. push_back(): pend an element at the end of the vector
v1.push_back(9);
cout << v1.capacity() << endl;
// here it shows 10 since it doubles the "capacity" with the original vector capacity (5). This makes a dynamic array less time costly since it does not have to change the capacity every time a new element is added.
cout << v1.size() << endl;
// here it shows 6
for (int i = 0; i < v1.capacity(); i++)
{
cout << v1[i] << " ";
}
// its shows 1 2 3 4 5 9 0 0 0 0 (using 0 to fill capacity without elements.
// 3. pop_back(): remove the very element from the back of the vector
v1.pop_back();
cout << v1.size() << endl;
// it shows 5
cout << v1.capacity() << endl;
// the capacity remains as 10 even though there are only 5 elements
//4. shrink_to_fit(): to reduce the capacity
v1.shrink_to_fit();
cout << v1.size() << endl;
// it shows 5
cout << v1.capacity() << endl;
// it becomes 5
// 5. insert(): insert elements at specific positions
v1.insert(v1.begin(), 5);
// v1.begin(): returns an iterator to the beginning; 5: the inserted element at the beginning
cout << v1[0] << endl;
// it returns 5
v1.insert(v1.begin() + 1, 3); // insert at specific positions by indicating the position using "v1.begin() + n"cout << v1[0] << endl;
// it returns 5
cout << v1[1] << endl;
// it returns 3; v1.begin() + 1 is the index 1
for (int i = 0; i < v1.size(); i++)
{
cout << v1[i] << " ";
}
// it returns 5 3 1 2 3 4 5
// 6. erase an element
v1.erase(v1.begin() + 2);
for (int i = 0; i < v1.size(); i++)
{
cout << v1[i] << " ";
}
// it returns 5 3 2 3 4 5

Final words…

In the next post, “list” will be covered and compared with arrays. Note that an array may look cumbersome since it stores all the data together. Nevertheless, it offers an advantage, cache locality. “Since all the elements are present next to each other, when one of the elements is accessed, a few elements next to it are also brought into the cache”.

Reference

C++ Data Structures and Algorithm Design Principles: Leverage the power of modern C++ to build robust and scalable applications

--

--