diff --git a/cpp-review/basics/lesson.md b/cpp-review/basics/lesson.md new file mode 100644 index 0000000..f64b905 --- /dev/null +++ b/cpp-review/basics/lesson.md @@ -0,0 +1,647 @@ +# C++ Basics + +**Author:** *Brian Magnuson* + +In the next few lessons, I'll be covering almost everything you need to know about the C++ programming language. C++ is a statically-typed, compiled, general-purpose programming language. Its low-level memory manipulation capabilities and Standard Template Library (STL) make it an ideal language for learning data structures and algorithms. + +These lessons are meant to be a *review* of C++. I'll assume you have some experience with an object-oriented programming language like Java or Python. If you're new to programming, I recommend looking for more beginner-friendly resources on C++. + +If you are ***already familiar with C++***, I would still recommend reading through these lessons as they may contain some useful information for success in this course. + +In this lesson, we will cover: +- The main function +- Preprocessor directives +- Data types +- Variables +- Control structures +- Functions +- Function declarations and header files +- Standard input/output +- Using keyword + +In addition to the above, I will also touch on best C++ practices to help you write well-structured code. + +Notably, I have chosen to omit arrays from this lesson. This and memory management will be covered in the next lesson. + +Let's get started! + +# The Main Function + +If you're building an executable, you will need a main function. Execution begins and ends in the main function. The main function is defined in a file, typically named `main.cpp`. Note that some style guides may recommend using `.cc` or `.cxx` as the file extension for C++ source files. 
+ +**For this course, we use `.cpp` for source files and `.h` for header files.** + +A main function looks like this: +```cpp +int main() { + // Your code here + return 0; +} +``` + +The main function returns an integer. A return value of `0` indicates that the program executed successfully. Any other value indicates an error. You can also use `std::exit` from the `<cstdlib>` header to exit from anywhere in the program. + +A main function can also look like this: +```cpp +int main(int argc, char* argv[]) { + // Your code here + return 0; +} +``` + +You may write a main function like this if you want to accept command-line arguments. `argc` is the number of arguments passed to the program, and `argv` is an array of strings containing the arguments. When there are command-line arguments, the string that invoked the program is stored in `argv[0]`, and the arguments are stored in `argv[1]`, `argv[2]`, and so on. + +# Preprocessor Directives + +C++ programs are built in three stages: preprocessing, compiling, and linking. Preprocessing is the first stage. The preprocessor reads the source code and processes preprocessor directives, which start with a `#` symbol. This is all done *before* compilation. + +The `#define` directive creates a macro. During preprocessing, all instances of the macro are replaced with the value or expression. Macros are used to define constants or to create simple functions. Because the replacement is purely textual, wrap macro parameters and the whole expression in parentheses to avoid operator-precedence surprises. +```cpp +#define MACRO_NAME_1 value +#define MACRO_NAME_2(VALUE) ((VALUE) + 1) +``` + +Macros are often written in `UPPER_SNAKE_CASE` to distinguish them from variables. + +You can also define a macro without a value. This is often used to create a *flag* for conditional compilation. +```cpp +#define FLAG +``` + +The `#include` directive is used to include files. During preprocessing, the contents of the file are copied and pasted where the directive is located. 
+```cpp +#include <iostream> +#include "my_header.h" +``` + +Using quotes tells the preprocessor to check files relative to the current file, then check the system directories. Using angle brackets tells the preprocessor to check the system directories only. Typically, you will use quotes for your own header files and angle brackets for system header files. + +The `#ifdef`, `#ifndef`, `#else`, and `#endif` directives are used for conditional compilation. These directives are used to include or exclude code based on whether a macro is defined. +```cpp +#ifdef FLAG + // Code here +#else + // Code here +#endif +``` + +These are often useful when you want to compile code for a specific platform or when you want to include debugging code that you can easily turn off. + +The `#ifndef` directive is particularly useful for creating an **include guard**. An include guard prevents a header's contents from being included more than once in the same source file (which can cause "redefinition" errors). All header files should have an include guard as they are often included in multiple files. +```cpp +// my_header.h +#ifndef MY_HEADER_H +#define MY_HEADER_H + +// Header code here + +#endif +``` + +You can choose any name for the macro, but it should be unique to the file. Google's style guide recommends the format `<PROJECT>_<PATH>_<FILE>_H_`. For smaller projects, simply naming the macro after the file is sufficient (as shown above). + +These are most of the directives you will encounter in C++. There is, however, one more: the `#pragma` directive. This directive is used to give instructions to the compiler. However, the exact behavior of `#pragma` is compiler-dependent. + +One common use of `#pragma` is to act as a simpler include guard. +```cpp +// my_header.h +#pragma once +// No need to use #ifndef MY_HEADER_H + +// Header code here +``` + +GCC, Clang, and MSVC all support the `#pragma once` directive, which tells the compiler to only include the file once. 
We accept either method in this course (though I have a slight preference for `#ifndef`). + +# Data Types + +C++ has a variety of data types for integers. Confusingly, the size of these data types may vary depending on the platform. + +- `char`: At least 8 bits, and virtually all modern architectures use 8 bits. + - Values of the `char` data type are printed as characters. 8 bits is enough to represent all ASCII characters. + - Range: -128 to 127 for signed, 0 to 255 for unsigned. + - May be signed or unsigned by default depending on the platform. +- `short` or `short int`: At least 16 bits, and virtually all modern architectures use 16 bits. + - Range: 16 bits is -32,768 to 32,767 for signed, 0 to 65,535 for unsigned. +- `int`: At least 16 bits, but most architectures use 32 bits. + - Range: 32 bits is -2,147,483,648 to 2,147,483,647 for signed, 0 to 4,294,967,295 for unsigned. +- `long` or `long int`: At least 32 bits. Some architectures use 32 bits. Unix-like systems with a 64-bit architecture use 64 bits. + - Range: 64 bits is -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 for signed, 0 to 18,446,744,073,709,551,615 for unsigned. +- `long long` or `long long int`: At least 64 bits, and virtually all modern architectures use 64 bits. + +All integer types (except `char`) are signed by default. To make them unsigned, add the `unsigned` keyword. For example, `unsigned int`. + +If you want to guarantee a specific size, the `<cstdint>` header provides fixed-width integer types such as `std::int32_t` and `std::uint64_t`. However, for this course, using `char`, `int`, and `long long` is usually sufficient. + +For floating-point numbers, C++ has two data types: +- `float`: Single precision, IEEE-754 standard. + - 32 bits. +- `double`: Double precision, IEEE-754 standard. + - 64 bits. + +Numeric literals with a decimal point are treated as `double` by default. To specify a `float`, add an `f` or `F` at the end of the number. E.g., `3.14f`. 
+ +Notably, both IEEE-754 floating point types also support positive infinity, negative infinity, `NaN` (not a number), and negative zero. These can be useful for some calculations. Just be aware that any comparison with `NaN` will always return `false`, including with itself. + +Be careful when comparing floating-point numbers. Due to the way they are stored in memory, floating-point numbers may not be exactly equal. Instead, compare the difference between the two numbers to a small value (often called epsilon). E.g., `fabs(a - b) < 1e-9`. + +Next, there is the `bool` data type. It is almost always 1 byte in size. The values are `true` and `false`. `true` is equivalent to 1, and `false` is equivalent to 0. Note that whenever a boolean value is needed (such as in an `if` statement), any value equivalent to 0 is considered `false`, and any other value is considered `true`. `NaN` is considered `true`. + +Finally, there are pointer types (types with a `*`). We'll cover pointers in the next lesson, but for now, know that pointers are used to store memory addresses. The size of a pointer depends on the architecture. For 32-bit systems, pointers are 4 bytes. For 64-bit systems, pointers are 8 bytes. + +To read more about data types in C++, see [this page](https://en.cppreference.com/w/cpp/language/types). + +# Variables + +Variables can be given any valid identifier. An identifier is a sequence of letters, digits, and underscores that does not start with a digit. Many programmers use `snake_case` or `lowerCamelCase` for variable and function names, though I have a preference for `snake_case`. You can use any style you like, but ***be consistent***. + +A local variable in C++ can be declared like this: +```cpp +int variable_name; +int variable_name = 1; +auto variable_name = 1; +``` + +`auto` is not a data type nor a keyword for a dynamic type (C++ is statically typed). Rather, it tells the compiler to infer the data type from the value assigned to the variable. 
This is useful when the data type is long (like `std::unordered_map<std::string, std::vector<int>>::iterator`). + +In rare cases, you may want to use `decltype` to use the type of another variable. +```cpp +decltype(variable_name) another_variable_name = 2; +``` + +You can use the `const` keyword to make a variable constant. This means the variable cannot be changed after it is initialized. +```cpp +const int constant_name = 1; +``` + +You can use `constexpr` to make a variable constant at compile time. Some features of C++ such as array sizes require compile-time constants. +```cpp +constexpr int compile_time_constant = 1; +``` + +We recommend against *creating global variables*. Global variables can be accessed from any function, which can make it difficult to track down bugs. If you need to share data between functions, consider using function arguments or returning values. + +To create a reference to a variable, use the `&` symbol. +```cpp +int variable_name = 1; +int& reference_name = variable_name; +reference_name = 2; // also changes variable_name +``` + +References must always be initialized when declared, and a reference cannot be rebound to refer to a different variable afterward. References are often used with range-based `for` loops and function arguments to avoid expensive copies. Beyond those uses, it is best to keep references confined to local variables. + +It is possible to use references in structs and classes, but without guaranteeing the lifetime of the referenced variable, it can potentially lead to undefined behavior, so it is best to avoid this. + +References can be thought of as a special kind of pointer. We will cover pointers and memory management in the next lesson. + +# Control Structures + +These, you probably already know. But let's review them anyway. + +The `if` statement is used to execute code based on a condition. 
+```cpp +if (condition) { + // Code here +} + +if (condition) { + // Code here +} else if (other_condition) { + // Code here +} else { + // Code here +} +``` + +The `switch` statement is used to execute code based on the value of an expression. +```cpp +switch (expression) { + case value_1: + // Code here + break; + case value_2: + // Code here + break; + default: + // Code here +} +``` +Unlike other languages, the expression in a `switch` statement must be an integer or an enumeration. Floating-point numbers, strings, and other types are not allowed. + +The `while` statement is used to execute code while a condition is true. +```cpp +while (condition) { + // Code here +} +``` + +The `do-while` statement is used to execute code at least once and then while a condition is true. +```cpp +do { + // Code here +} while (condition); +``` + +The `for` statement allows you to initialize a variable, check a condition, and update the variable in one line. It is often used to iterate a specific number of times. +```cpp +for (int i = 0; i < 10; ++i) { + // Code here +} +``` + +I won't be covering most operators in this lesson, but I would like to bring special attention to the `++` and `--` operators, also known as the increment and decrement operators. These operators can be used as a prefix or postfix. The difference is that the prefix operator increments or decrements the variable before the value is used, while the postfix operator increments or decrements the variable after the value is used. For example, `++i` increments `i` and then uses the new value, while `i++` uses the current value of `i` and then increments it. + +The range-based `for` statement is used to iterate over a range of elements. It is often used with arrays, vectors, and other containers. +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +for (int i : vec) { + // Code here +} +for (int& i : vec) { + // Code here +} +``` + +You can also use `auto` and `auto&` instead of writing the data type explicitly. 
+ +For all kinds of loops, you can use `break` to exit the loop early and `continue` to skip the rest of the loop and go to the next iteration. + +# Functions + +Here are examples of functions in C++: +```cpp +int add(int a, int b) { + return a + b; +} + +void print_result(int result) { + std::cout << result << std::endl; +} +``` + +The return type is written first, then the function name, then the parameters in parentheses. If the function does not return a value, use `void`. For non-void functions, all code paths should always return a value. + +You can use `...` as the last parameter in a function to create a variadic function. These can be useful at times, but I would recommend against using them for this course. You can read more about variadic functions [here](https://en.cppreference.com/w/cpp/language/variadic_arguments). + +In C++, you can declare multiple functions with the same name but different parameters. This is called **function overloading**. The compiler will choose the correct function based on the parameters passed to it. +```cpp +int add(int a, int b) { + return a + b; +} +double add(double a, double b) { + return a + b; +} +``` + +When you pass an argument to a function, the default behavior is to *copy* the argument. If the object is defined by a class or struct, the copy constructor is called. For example, if you pass a `std::vector` to a function, the entire vector is copied. +```cpp +void foo(std::vector<int> vec) {...} +``` + +This can be expensive for large objects. If you don't need a copy of the object, you can instead pass it by reference or `const` reference. +```cpp +void foo(std::vector<int>& vec) {...} +void foo(const std::vector<int>& vec) {...} +``` + +Using `const` is like making a promise to callers that you won't modify the object. This can help prevent bugs and make your code easier to understand. + +# Function Declarations and Header Files + +This next part is ***important***, especially for C++ beginners, so pay attention! 
+ +You ***cannot use a function before it is declared***. This is because C++ reads files from top to bottom. If you try to use a function before it is declared, the compiler will report an error. +```cpp +// In ping_pong.cpp +void ping(int n) { + if (n == 0) return; + pong(n - 1); // Error: pong is not declared +} + +void pong(int n) { + if (n == 0) return; + ping(n - 1); +} +``` + +To fix this, you can declare the function before it is used. This function declaration is called a **prototype**. +```cpp +// In ping_pong.cpp +void pong(int n); + +void ping(int n) { + if (n == 0) return; + pong(n - 1); +} + +void pong(int n) { + if (n == 0) return; + ping(n - 1); +} +``` + +If we want to use the `ping` and `pong` functions in another file, we can create a header file. Header files typically have the `.h` extension. +```cpp +// In ping_pong.h +#ifndef PING_PONG_H +#define PING_PONG_H + +void pong(int n); + +void ping(int n) {...} +void pong(int n) {...} +// Not recommended, but allowed + +#endif +``` + +Although we could include the function definitions in the header file, doing so causes "multiple definition" linker errors as soon as the header is included from more than one source file (unless the functions are marked `inline`), and it can slow down compilation, especially for large projects. For best practices, we should only include the function declarations in the header file and put the function definitions in a source file (typically with a `.cpp` extension). +```cpp +// In ping_pong.h +#ifndef PING_PONG_H +#define PING_PONG_H + +void ping(int n); +void pong(int n); + +#endif +``` +```cpp +// In ping_pong.cpp +#include "ping_pong.h" + +void ping(int n) {...} +void pong(int n) {...} +``` + +This is the ***recommended way*** to structure your C++ code. An exception to this rule is when you are using templates. Template functions must be defined in the same header file they are declared in. This is mainly a quirk of how templates work in C++. + +We can then include the header file in the source file where we write the function definitions. 
+```cpp +// In ping_pong.cpp +#include "ping_pong.h" + +void ping(int n) { + if (n == 0) return; + pong(n - 1); +} + +void pong(int n) { + if (n == 0) return; + ping(n - 1); +} +``` + +Then, if we want to use the `ping` and `pong` functions in another file, we can include the header file. +```cpp +// In main.cpp +#include "ping_pong.h" + +int main() { + ping(10); + return 0; +} +``` + +The main function will be able to use the `ping` and `pong` functions as long as `main.cpp` and `ping_pong.cpp` are compiled and linked together. + +For many C++ projects, you'll find that files often come in pairs: a `.cpp` file with the function definitions and a `.h` file with the function declarations. This is a common practice in C++ and is recommended for this course. + +Remember that the C++ preprocessor will look for quoted files *relative* to the current file. So if `ping_pong.h` were located in a different directory, say `my_lib`, you would write `#include "my_lib/ping_pong.h"`. You do not need to do this for system header files because the preprocessor checks a specific set of directories where these files are located. You can technically add your own directories to this list using the `-I` flag during compilation, but this is recommended only if you know what you are doing. + +You should ***never*** use `#include` to include a `.cpp` file. This can lead to multiple definitions of functions and variables, which will cause errors. Instead, use header files to declare functions and variables. + +# Standard Input/Output + +C++ uses `std::cin` for input and `std::cout` for output. To use these, you need to include the `<iostream>` header. +```cpp +#include <iostream> + +int main() { + int n; + std::cin >> n; + std::cout << n << std::endl; + return 0; +} +``` + +C++ will automatically convert the input to the correct data type. If the input is not a valid number, `std::cin` will set the fail bit, and you can check this with `std::cin.fail()`. 
You can also use `std::cin.clear()` to clear the fail bit and `std::cin.ignore()` to ignore the invalid input. + +If the input is invalid, `n` does not receive a meaningful value. Since C++11, a failed extraction writes `0` to `n`; before C++11, `n` was left unmodified, which in the example above means uninitialized. Either way, a failed read can look like legitimate input unless you check the stream state. + +Most programming problems in this course use a strict input format (which we specify), so you usually will not need to validate the input. + +`std::endl` is used to insert a newline character and flush the output buffer. You can also use `'\n'` to insert a newline character without flushing the buffer. It is worth noting that in most command-line environments, the output buffer is automatically flushed when a newline character is printed, so `std::endl` is not necessary. You may use either. + +If you want to get an entire line as input, you can use `std::getline`. +```cpp +#include <iostream> +#include <string> + +int main() { + std::string s; + std::getline(std::cin, s); + std::cout << s << std::endl; + return 0; +} +``` + +You can also specify the delimiter for `std::getline` if you want to read until a specific character. + +# Using Keyword + +The last thing I want to cover in this lesson is the `using` keyword. The `using` keyword can be used for creating aliases. However, I'd like to focus on what a lot of beginners use with `using`: `using namespace std`. + +You might've noticed that the last few examples have `std::` in front of several functions, types, and objects. This is because these are part of the `std` namespace. The `std` (standard) namespace is a collection of functions, types, and objects provided by the C++ Standard Library. + +Beginners will often use `using namespace std;` at the top of their code. This allows the use of standard functions and types without the `std::` prefix. +```cpp +#include <iostream> +using namespace std; + +int main() { + cout << "Hello, World!" << endl; + return 0; +} +``` + +Many style guides recommend against this as it can lead to naming conflicts. 
+ +Perhaps a slightly safer alternative is to use `using` declarations. +```cpp +#include <iostream> +using std::cout; +using std::endl; + +int main() { + cout << "Hello, World!" << endl; + return 0; +} +``` + +Here are ***my recommendations for this course***: + +You may use `using namespace std;` for any programming quiz, pseudocode problem, interview question, or project in this course. It is generally acceptable to use it in small projects. + +Even I used it plenty of times before, but I've since outgrown it and have adapted to putting `std::` in front of everything. Future lessons may include it, others may not. + +# Practice + +Here are some practice problems to help you get comfortable with the concepts covered in this lesson. + +What is the output of the following code? +```cpp +#include <iostream> +int main() { + int a = 1; + int b = 2; + int c = a / b; + std::cout << c << std::endl; + return 0; +} +``` +- 0 +- 0.5 +- 1 +- (produces a compile error) +- (result is undefined) + +
+Answer +0 +
+ +--- + +What is the output of the following code? +```cpp +#include <iostream> +int main() { + int a = 1; + int& b = a; + b = 2; + std::cout << a << b << std::endl; + return 0; +} +``` +- 11 +- 12 +- 22 +- 21 +- (produces a compile error) + +
+Answer +22 +
+ + +--- + +What is the output of the following code? +```cpp +#include <iostream> +int main() { + int n = 8; + while (n) { + n /= 2; + std::cout << n; + } + return 0; +} +``` +- 421 +- 4210 +- 8421 +- (produces a compile error) +- (program does not terminate) + +
+Answer +4210 +
+ +--- + +What is the output of the following code? +```cpp +#include <iostream> +int main() { + int n = 0; + std::cout << n++ << n++ << ++n << std::endl; + return 0; +} +``` +- 011 +- 012 +- 013 +- 123 +- 124 + +
+Answer +013 +
+ +--- + +What is the output of the following code? +```cpp +#include <iostream> +int main() { + int n = 5; + for (int i = 1; i < n; ++i) { + if (i % 2 == 0) continue; + std::cout << i; + } + return 0; +} +``` +- 12345 +- 135 +- 13 +- 1 +- (nothing is printed) + +
+Answer +13 +
+ +--- + +What is the output of the following code? +```cpp +#include <iostream> +void foo(int n, int& m) { + int temp = n; + n = m; + m = temp; +} +int main() { + int a = 1; + int b = 2; + foo(a, b); + std::cout << a << b << std::endl; + return 0; +} +``` +- 11 +- 12 +- 21 +- 22 +- (produces a compile error) + +
+Answer +11 +
+ +# Conclusion + +That's it for this lesson! We've covered a lot of ground, but this is just the beginning! In the next lesson, we'll cover arrays and memory management. + +# References + +- [C++ Reference](https://en.cppreference.com/w/cpp) +- [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html) diff --git a/cpp-review/classes/image.png b/cpp-review/classes/image.png new file mode 100644 index 0000000..45a3fca Binary files /dev/null and b/cpp-review/classes/image.png differ diff --git a/cpp-review/classes/lesson.md b/cpp-review/classes/lesson.md new file mode 100644 index 0000000..91033ab --- /dev/null +++ b/cpp-review/classes/lesson.md @@ -0,0 +1,629 @@ +# C++ Classes + +**Author:** *Brian Magnuson* + +In this lesson, we'll continue where we left off in our C++ review. +We'll be discussing classes, which are a fundamental part of object-oriented programming (OOP). + +In this lesson, we will cover: +- Class basics +- Constructors +- Member functions +- Copy constructors and copy assignment operators +- Destructors +- Static members +- Inheritance +- Abstract classes + +# Class Basics + +A class is usually defined in a header file, and its function implementations are defined in a separate source file. Some people prefer to define each class in its own header/source file pair. This can help keep your code organized and make it easier to find things. + +Here's an example of a simple class definition: +```cpp +class MyClass { + int z; // private by default +public: + int x; + int y; +}; +``` +Here, our class has two public member variables, `x` and `y`. These variables can be accessed and modified from outside the class. + +To use our class, we can create an instance of it like this: +```cpp +MyClass obj; +obj.x = 5; +obj.y = 10; + +std::cout << obj.x << " " << obj.y << std::endl; // Output: 5 10 +``` +The `.` operator is used to access the members of an object. 
Outside of the class definition, we can only access public members, i.e., members declared after the `public:` access specifier. + +By default, members of a class are private, meaning they can only be accessed from within the class itself. To make members public, we use the `public:` access specifier. We can also use `private:` to declare private members after a public section. + +There is another way to declare a custom composite type in C++: using a `struct`. The *only* difference between a `class` and a `struct` is that members of a `struct` are public by default, while members of a `class` are private by default. +```cpp +struct MyStruct { + int x; // public by default + int y; // also public by default +private: + int z; +}; +``` + +By specifying which members we want to make public or private, we can control how our class is used. This feature is known as **encapsulation**. Encapsulation allows us to hide certain features of our class from the outside world, preventing unwanted access and modification. + +For your C++ projects in this course, you will be expected to utilize encapsulation to protect your class members. + +# Constructors + +Suppose we have the following classes: +```cpp +class Vector2D { +public: + int x = 0; + int y = 0; +}; +class Character { +public: + Vector2D position; + int health; + int power; +}; +``` + +We can instantiate a `Character` in one of these ways: +```cpp +Character player_1 = Character(); +Character player_2; // More concise +``` + +Both of these methods call the class's **default constructor**. A **constructor** is a special member function that instantiates an object of a class. A default constructor is a constructor that takes no arguments. + +Since we didn't define any constructor for `Character`, C++ provides a default constructor for us. The automatic default constructor calls the default constructors of all the class's members, except for primitive types like `int`, which are not initialized. 
Using `player_1.health` at this point would be undefined behavior. + +To define our own constructor, we can do this: +```cpp +class Character { +public: + Vector2D position; + int health; + int power; + Character() { + health = 100; + power = 10; + } +}; +``` + +Constructors and destructors do not have a return type. + +We can also define the constructor outside the class definition: +```cpp +class Character { +public: + Vector2D position; + int health; + int power; + Character(); +}; + +Character::Character() { + health = 100; + power = 10; +} +``` + +To define class members outside the class definition, we use the scope resolution operator `::`. This operator is used to define functions that are declared inside a class. Remember, it's a good idea to put your function implementations in a separate source file. + +You can also define your constructor like this: +```cpp +Character::Character() : health(100), power(10) {} +``` + +This method effectively does the same thing, but by using an initializer list like this, you can initialize members that cannot be assigned a value in the constructor body (such as `const` members). + +You can also write constructors that take arguments: +```cpp +class Character { +public: + Vector2D position; + int health; + int power; + Character(int h, int p) : health(h), power(p) {} +}; +``` + +However, once you declare any constructor yourself, C++ will NOT provide a default constructor for you. If you still want one, you must define it as well. + +To instantiate a `Character` with this constructor, you would use one of these methods: +```cpp +Character player_1 = Character(100, 10); +Character player_2(100, 10); // More concise +``` + +# Member Functions + +Member functions are functions that are part of a class. They can access the class's private members and are called using the `.` operator on an object of the class. 
+ +Here's an example of a class with a member function: +```cpp +class Character { + Vector2D position; + int health = 100; + int power = 10; +public: + void move(int x, int y); +}; + +void Character::move(int x, int y) { + position.x += x; + position.y += y; +} +``` + +And we can use it like this: +```cpp +Character player; +player.move(5, 10); +``` + +Even though `position` is private, the `move` function can access it because it's a member of the `Character` class. + +If `player` happened to be a pointer, we would have to dereference it to call the member function. The syntax for this is a bit messy, which is why C++ provides a special operator `->` for this purpose: +```cpp +Character* player = new Character(); +// (*player).move(5, 10); +player->move(5, 10); +``` + +All non-static member functions have access to a special pointer called `this`, which points to the object on which the function was called. Earlier, we did not need it because `position` unambiguously referred to the `position` member. However, if we had a parameter with the same name as a member, we would need to use `this` to access the member: +```cpp +void Character::take_damage(int health) { + this->health -= health; +} +``` + +Notice that we have to use the `->` operator here since `this` is a pointer. + +Both of these member functions, `move` and `take_damage`, change, or **mutate**, the state of the object. Sometimes, we don't want to mutate the object. And in some cases, we *can't* mutate the object. +```cpp +const Character player; +// player.move(5, 10); // Error: Cannot call non-const member function on const object +``` + +This helps make sure we don't accidentally change the state of an object when we shouldn't. But not every member function will mutate the object. 
In these cases, we need to tell the compiler that the function won't change the object's state by using the `const` keyword: +```cpp +bool Character::is_alive() const { + return health > 0; +} +``` + +By marking the function as `const`, we disallow any modifications to `this` inside the function. We can then call this function on a `const` object: +```cpp +const Character player; +player.is_alive(); // OK +``` + +It is good practice to mark member functions as `const` if they do not modify the object's state. + +On rare occasions, you will want to create a member function that returns a reference to a member variable for the caller to access and possibly modify. In these cases, you may need both a `const` and non-`const` version of the function. This is the case for many STL containers, like `std::vector`. + +# Copy Constructors and Copy Assignment Operators + +There are two ways to create a copy of an object in C++: the copy constructor and the copy assignment operator. +```cpp +Character player_1; +Character player_2 = player_1; // player_2 is copy-constructed from player_1 +Character player_3; +player_3 = player_1; // player_3 is copy-assigned from player_1 +``` + +The difference between the two is that one is used to create a new object, while the other is used to assign an existing object. + +The copy constructor and copy assignment operator have these signatures: +```cpp +Character(const Character& other); +Character& operator=(const Character& other); +``` + +If you do not define a copy constructor or copy assignment operator, C++ will provide a default implementation for you. This implementation will simply copy each member of the object one by one. + +If one of your members is a pointer, this default implementation will copy the pointer, not the object it points to. In other words, both objects will point to the same memory location. This is known as a **shallow copy**. This can be problematic if one of your objects will deallocate the memory it points to. 
The other object will be left with a dangling pointer. +```cpp +class Character { + char* name; +}; +``` + +To fix this, you need to define your own copy constructor and copy assignment operator. This is known as a **deep copy**. +```cpp +Character(const Character& other) { + name = new char[strlen(other.name) + 1]; + strcpy(name, other.name); +} +Character& operator=(const Character& other) { + if (this != &other) { + delete[] name; + name = new char[strlen(other.name) + 1]; + strcpy(name, other.name); + } + return *this; +} +``` + +Notice how the copy assignment operator has to return `*this`. + +On the rare occasion that you want to prevent copying of your object, you can delete the copy constructor and copy assignment operator: +```cpp +class Character { + Character(const Character& other) = delete; + Character& operator=(const Character& other) = delete; +}; +``` + +You can also do this with the default constructor and destructor. + +There is another special set of member functions called **move constructors** and **move assignment operators**. These are used to transfer ownership of resources from one object to another. We won't cover them in this lesson. + +# Destructors + +A **destructor** is another special member function. It is used to clean up resources that the object has acquired during its lifetime. It is automatically called under the following circumstances: +- If the object is allocated on the stack, the destructor is called when the object goes out of scope. +- If the object is allocated on the heap, the destructor is called when `delete` is called on the object. + +Like the constructor, it has no return type. It also does not take any arguments. 
+```cpp +class Character { + char* name; +public: + Character(const char* n) { + name = new char[strlen(n) + 1]; + strcpy(name, n); + } + /* copy constructor and copy assignment operator here */ + ~Character() { + delete[] name; + } +}; +``` + +A common rule of thumb is that for the copy constructor, copy assignment operator, and destructor, if you need to define one, you probably need to define all three. + +If you don't define a destructor, C++ will provide a default implementation for you. This implementation will not delete any resources that the object has acquired, which can lead to memory leaks. + +# Static Members + +A **static member** is a member of a class that belongs to the class itself, not any particular instance of the class. Static members are shared among all instances of the class. They are declared with the `static` keyword. +```cpp +class Character { + static int num_characters; // A static member variable +public: + // A static member function + static int get_num_characters() { + return num_characters; + } + /* Constructor, destructor, etc. */ +}; +``` + +Please note that this use of the keyword `static` is different from the use of `static` in a global context (how C users might be familiar with it). + +A static member variable cannot be initialized inside the class definition unless it is a `const` integral type. Instead, you must initialize it outside the class definition. You can do this in the source (`.cpp`) file. +```cpp +int Character::num_characters = 0; +``` + +To access a static member, you use the scope resolution operator `::`, or use an instance of the class. +```cpp +Character::num_characters++; +Character::get_num_characters(); +player.get_num_characters(); // Also valid, but less obvious +``` + +# Inheritance + +Let's say you want to create a new class that is similar to an existing class, but with some additional features. You can use **inheritance** to achieve this. 
+```cpp +class Player : public Character { + int score; +public: + Player(int h, int p, int s) : Character(h, p), score(s) {} +}; +``` + +Here, we write `Player : public Character`. This means that `Player` inherits from `Character`. This allows `Player` to access all of `Character`'s public and protected members and member functions. We can also add new members and member functions to `Player` like `score`. + +We interpret public inheritance as an "is-a" relationship. In this case, a *`Player` is a `Character`*. +There are different types of inheritance in C++: **public**, **protected**, and **private**. You probably won't need to use the latter two in this course, so we will focus on public inheritance. + +If the members of `Character` are private, `Player` will not be able to access them. If we want `Player` to access them, but maintain some level of encapsulation, we can use the `protected:` access specifier: +```cpp +class Character { +protected: + int health; + int power; +}; +``` + +This makes it so that only `Character` and its derived classes can access these members. + +Inheritance allows us to create new classes while reusing existing code. This is a key feature of OOP. Inheritance also allows for **polymorphism**. + +Suppose we want to change the behavior of the `take_damage` function for a `Player`. We can override the function in the `Player` class: +```cpp +class Player : public Character { + int score; +public: + Player(int h, int p, int s) : Character(h, p), score(s) {} + void take_damage(int damage) { + health -= damage / 2; + } +}; +``` + +Now, when we call `take_damage` on a `Player`, it will use the `Player` version of the function, not the `Character` version. + +In C++, pointers of a base class can point to objects of a derived class. In our case, we can make a `Character*` point to a `Player` object. After all, a `Player` is a `Character`. +```cpp +Character* player = new Player(100, 10, 0); +``` + +This has a few caveats. 
+First, we cannot access `score` through `player` because C++ does not know if `score` exists in this kind of character. If we're careful, we can cast `player` to a `Player*` to access `score`. +```cpp +Player* player = dynamic_cast<Player*>(character); +if (player) { + std::cout << player->score << std::endl; +} +``` + +Dynamic casting is a way to safely cast a pointer of a base class to a pointer of a derived class. If the cast fails, `dynamic_cast` will return `nullptr`. + +Second, if we try to call `take_damage` on `player`, it will call the `Character` version of the function. To fix this, we need to make the function `virtual` in the base class: +```cpp +class Character { +public: + virtual void take_damage(int damage); // Add the virtual keyword +}; +``` + +When you make a function `virtual`, you are telling the compiler that this function can be overridden by derived classes. While we're at it, we should also mark the overriding function with the `override` keyword: +```cpp +class Player : public Character { +public: + void take_damage(int damage) override; // Add the override keyword +}; +``` + +The `override` keyword is not strictly necessary, but it ensures that the function is overriding the correct function in the base class. + +Now, when we call `take_damage` on a `Character*`, it will call the correct version of the function. +```cpp +Character* player = new Player(100, 10, 0); +player->take_damage(10); // Calls Player::take_damage +``` + +Third, when we delete a `Character*`, only the `Character` destructor will be called. This is bad since `Player` has its own resources to clean up. To fix this, we need to make the destructor `virtual` in the base class: +```cpp +class Character { +public: + virtual ~Character(); +}; +``` + +Polymorphism is tricky, but it can be a powerful tool when used correctly. We can have a collection of `Character*` pointers that point to different types of characters and call the correct functions on them. 
+```cpp +std::vector<Character*> characters; +characters.push_back(new Player(100, 10, 0)); +characters.push_back(new Character(100, 10)); +for (Character* character : characters) { + character->take_damage(10); +} +``` + +It is possible for a class to inherit from multiple classes. The derived class will inherit all the members and member functions of the base classes. This is known as **multiple inheritance**. We will not go too in-depth on this topic. However, multiple inheritance can lead to a potential problem called the **diamond problem**. + +![Diagram of multiple base classes inheriting from a common base class.](image.png) + +Simply put, if a derived class has two base classes that both inherit from the same class, it ends up with two copies of that common base class, which makes access to its members ambiguous. To fix this, you can use **virtual inheritance**. You can read more about the diamond problem [here](https://www.geeksforgeeks.org/diamond-problem-in-cpp/). + +# Abstract Classes + +Earlier, we explained how the `virtual` keyword allows a function in a base class to be overridden by a derived class. In addition to the `virtual` keyword, we can add the pure specifier `= 0` to a function to make it a **pure virtual function**. +```cpp +class Entity { +public: + virtual void update() = 0; + virtual ~Entity() {} +}; +``` + +By using the pure specifier, we indicate that this function has *no implementation* in this class. You cannot write a body alongside the pure specifier in the declaration. + +When a class has at least one pure virtual function, it becomes an **abstract class**. Abstract classes cannot be instantiated. They are used as a base class for other classes that will provide implementations for the pure virtual functions. +```cpp +class Character : public Entity { +public: + void update() override { + // Update character + } +}; +``` + +If we choose not to override `update`, then `Character` will also be an abstract class. To make a concrete class, we must override all pure virtual functions. 
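
As a minimal sketch of these rules, the snippet below defines a concrete class and drives it through pointers to the abstract base. The `Counter` class, its `value()` member, and the `update_all` helper are hypothetical names introduced only for illustration; they are not part of the lesson's `Character` example.

```cpp
#include <vector>

// Abstract base class: the pure virtual function makes it non-instantiable.
class Entity {
public:
    virtual void update() = 0;
    virtual ~Entity() {}
};

// Hypothetical concrete class: it overrides every pure virtual function.
class Counter : public Entity {
    int ticks = 0;
public:
    void update() override { ticks++; }
    int value() const { return ticks; }
};

// Hypothetical helper: calls update() through the abstract interface.
void update_all(const std::vector<Entity*>& entities) {
    for (Entity* entity : entities) {
        entity->update();  // Dispatches to the derived class's override
    }
}
```

Writing `Entity e;` would fail to compile because `Entity` is abstract, while `Counter c;` is fine. Passing `&c` to `update_all` inside a vector increments the counter once per occurrence.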
+ +You can still create pointers of an abstract class type, so the rules of polymorphism still apply. + +Abstract classes are useful in a variety of situations: +- It may not always make sense to give every base class function an implementation. The class is too abstract. +- You might want to prevent a base class from being instantiated. The user must use a derived class. +- You need to require derived classes to implement a certain function with certain inputs and outputs. In this way, you create an interface for outside users to follow and a *contract* that derived classes must adhere to. + - This is the basis of how interfaces work in other languages. + +# Practice + +Let's go through a few practice problems to check your understanding of classes in C++. + +Consider the following code: +```cpp +class MyClass { + int x; +}; +``` +Which of the following statements are true? +- `x` is a public member variable +- `MyClass` does not have a constructor +- When constructed, `x` will be uninitialized +- `MyClass` is not copyable +- None of these are true + +
+Answer +When constructed, `x` will be uninitialized +
+ +--- + +Consider the following code: +```cpp +int main() { + MyClass obj; + MyClass* obj_ptr = new MyClass(); + return 0; +} +``` +Which of these objects will have their destructor called when `main` returns? +- Neither of them +- `obj` only +- `obj_ptr` only +- Both `obj` and `obj_ptr` + +
+Answer +`obj` only +
+ +--- + +Consider the following code: +```cpp +class MyClass { + int x; +public: + MyClass(int x) : x(x) {} +}; +``` +Which of the following statements is NOT true? +- `MyClass` is default-constructible +- `MyClass` is copy-constructible +- `MyClass` is copy-assignable +- All of these are true + +
+Answer +`MyClass` is default-constructible +
+ +--- + +Consider the following code: +```cpp +int MyClass::get_x() const { + return x; +} +``` +What does the `const` keyword do in this context? +- The integer returned by `get_x` must be captured by a `const` variable +- The `get_x` function cannot be called on a `const` object +- The implementation of `get_x` cannot be changed +- The `get_x` function cannot modify the object's state +- It does nothing since the function does not return a reference + +
+Answer +The `get_x` function cannot modify the object's state +
+ +--- + +Consider the following code: +```cpp +class MyClass { + std::vector<int> data = {1, 2, 3}; +public: + int& at(int i) { + return data.at(i); + } +}; +int main() { + const int& x = MyClass().at(0); + return 0; +} +``` +What is the issue with this code? +- `data` cannot be initialized in the class definition +- `at` is not a `const` member function +- `MyClass` is not instantiated correctly +- `data` is not cleaned up after `main` returns + +
+Answer +`at` is not a `const` member function +
+ +--- + +Consider the following code: +```cpp +MyClass obj; +MyClass obj2 = obj; +``` +Which function was called as a result of the second line? +- The default constructor +- The copy constructor +- The copy assignment operator +- The destructor + +
+Answer +The copy constructor +
+ +--- + +What is the output of this code? +```cpp +struct Dog { + virtual void bark() {std::cout << "Woof!" << std::endl;} +}; +struct BigDog : Dog { + void bark() override {std::cout << "WOOF!" << std::endl;} +}; +int main() { + Dog* dog_ptr = new BigDog(); + dog_ptr->bark(); + delete dog_ptr; + return 0; +} +``` +- `Woof!` +- `WOOF!` +- `Woof!WOOF!` +- (compiler error) +- (runtime error due to ambiguity) + +
+Answer +`WOOF!` +
+ +# Conclusion + +That's it for this lesson! We've covered nearly everything you need to know about C++'s syntax and object-oriented programming in C++. We hope you've enjoyed these lessons and feel more comfortable with C++. + +# References + +- [C++ Reference](https://en.cppreference.com/w/cpp) +- [GeeksForGeeks](https://www.geeksforgeeks.org/c-plus-plus/) diff --git a/cpp-review/containers/lesson.md b/cpp-review/containers/lesson.md new file mode 100644 index 0000000..0ec29bf --- /dev/null +++ b/cpp-review/containers/lesson.md @@ -0,0 +1,573 @@ +# C++ Containers + +**Author:** *Brian Magnuson* + +In this lesson, we will be discussing C++'s containers library, one of the most important parts of the C++ standard library. + +Occasionally, I may refer to the containers library as the **Standard Template Library (STL)**. The STL actually includes containers, algorithms, and iterators. In this lesson, we will be discussing both containers and iterators. + +There are quite a few containers available in C++. Many of them use similar interfaces. So for this lesson, we will start by covering `std::vector`, the most commonly used container in C++, in detail. Then we will discuss the other containers. + +In this lesson, we will cover the following topics: +- Vectors +- Iterators +- Linked lists +- Stacks, queues, and double-ended queues +- Priority queues +- Sets +- Maps + +We will not be covering `std::array` in this lesson. `std::array` is a fixed-size array that is a wrapper around a C-style array. You can read more about `std::array` [here](https://en.cppreference.com/w/cpp/container/array). + +And one more thing: The purpose of this lesson is to teach you how to use these containers, not to teach you to implement them or the theory behind them. If I were to talk about that, this single lesson would turn into a full-size course on data structures and algorithms. We'll discuss these topics in later lessons. 
+ +# Vectors + +An `std::vector` is C++'s dynamic array class and is defined in the `<vector>` header. A **dynamic array** is an array that can grow and shrink in size. + +In actuality, dynamic arrays have a fixed capacity. When the array size reaches capacity, a new array of a larger capacity is allocated, and the elements are copied over. Usually, the old capacity is multiplied by some constant factor to get the new capacity, so this operation occurs less frequently as more elements are added. + +An `std::vector` can be initialized in a number of ways. Here are a few examples: +```cpp +// Create an empty vector +std::vector<int> vec1; + +// Create a vector with 10 elements, all initialized to 0 (or default constructed) +std::vector<int> vec2(10); + +// Create a vector with 10 elements, all initialized to 5 +std::vector<int> vec3(10, 5); + +// Create a vector using an initializer list +std::vector<int> vec4 = {1, 2, 3, 4, 5}; + +// Copy a vector +std::vector<int> vec5(vec4); + +// Copy a vector using an iterator +std::vector<int> vec6(vec4.begin(), vec4.end()); +``` + +## Empty, Size, and Clear + +I bring special attention to the `empty()`, `size()`, and `clear()` functions as almost all containers have these functions. + +- `empty()` returns `true` if the container is empty and `false` otherwise. +- `size()` returns the number of elements in the container. Usually a `size_t`, which is compatible with `unsigned int` (though GCC probably won't get mad if you use `int`). +- `clear()` removes all elements from the container. + +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +vec.empty(); // false +vec.size(); // 5 +vec.clear(); +vec.empty(); // true +vec.size(); // 0 +``` + +## Accessing Elements + +Vector elements can be accessed using the `[]` operator (sometimes called the subscript operator) or the `at()` function. The difference between the two is that `[]` does not do bounds checking, while `at()` does. 
If you try to access an element that is out of bounds using `at()`, an `std::out_of_range` exception will be thrown. + +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +vec[0]; // 1 +vec.at(0); // 1 +vec[5]; // Undefined behavior +vec.at(5); // Throws an exception +``` + +Unlike with maps, you cannot use the `[]` operator to insert elements into a vector. + +Both `[]` and `at()` return a reference to the element, so you can use them on the left side of an assignment statement to modify the element. Both have const and non-const versions, so you are free to use them with `const` variables. +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +vec[0] = 10; +vec.at(1) = 20; +``` + +You can use these functions in combination with the `size()` function to iterate over the elements of a vector. +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +for (size_t i = 0; i < vec.size(); i++) { + std::cout << vec[i] << " "; +} +``` + +This is the most basic way to iterate over a vector, but there are other ways which we will discuss later. + +Vectors allow you to access the front and back elements using the `front()` and `back()` functions, respectively. +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +vec.front(); // 1 +vec.back(); // 5 +``` + +## Inserting and Removing Elements + +With vectors, you can insert and remove elements at the end of the vector using the `push_back()` and `pop_back()` functions, respectively. +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +vec.back(); // 5 +vec.push_back(6); +vec.back(); // 6 +vec.pop_back(); +vec.back(); // 5 +``` + +As mentioned, when the vector reaches capacity, a new array is allocated and the elements are copied over. Because of this, the worst-case time complexity of `push_back()` is linear in container size. However, the amortized time complexity is constant and we generally treat it as such. + +If you happen to know how much space you will need, you should initialize the vector with that size, or call `reserve()` before adding elements, to avoid unnecessary reallocations. 
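
One way to act on this advice is the `reserve()` member function, which sets aside capacity without changing the vector's size. The sketch below is illustrative only, and the helper name `make_squares` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: builds {0, 1, 4, 9, ...} with `count` elements.
std::vector<int> make_squares(std::size_t count) {
    std::vector<int> squares;
    squares.reserve(count);  // Allocate once; size() is still 0 here
    for (std::size_t i = 0; i < count; i++) {
        // Capacity is already sufficient, so no reallocation happens here
        squares.push_back(static_cast<int>(i * i));
    }
    return squares;
}
```

Note that `std::vector<int> squares(count);` would instead create `count` zero-initialized elements, so each `push_back()` would append after them rather than fill them in.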
+ +## Comparing Vectors + +Vectors can be compared using the `==` and `!=` operators. The comparison is done element-wise. +```cpp +std::vector<int> vec1 = {1, 2, 3, 4, 5}; +std::vector<int> vec2 = {1, 2, 3, 4, 5}; + +vec1 == vec2; // true +vec1 != vec2; // false +``` + +Since each element is compared, the worst-case time complexity of comparing two vectors is linear in the size of the vectors. + +# Iterators + +Most containers have a public inner class called an **iterator**. An iterator is an object that allows you to traverse the elements of a container. Iterators are similar to pointers in that they can be dereferenced and incremented. For a vector, it is something like `std::vector<T>::iterator`, where `T` is the type of the elements in the vector. + +There are a few different types of iterators: +- Forward Iterators allow you to only move forward through the container. + - You can use the `++` operator to move to the next element. +- Bidirectional Iterators allow you to move forward and backward through the container. + - They include all the functionality of Forward Iterators. + - You can also use the `--` operator to move to the previous element. +- Random Access Iterators allow you to access any element in the container in constant time. + - They include all the functionality of Bidirectional Iterators. + - You can also use the `+`, `-`, `+=`, and `-=` operators to move to any element in the container. + +Vectors have random access iterators, so you can use the `+`, `-`, `+=`, and `-=` operators to move to any element in the vector. Not all containers have random access iterators, so you should be careful when using these operators. + +To demonstrate iterators, we'll loop through the elements of a vector using them: +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +for (auto it = vec.begin(); it != vec.end(); it++) { + std::cout << *it << " "; +} +``` + +Most containers have a `begin()` and `end()` function. 
+The `begin()` function returns an iterator to the first element, and the `end()` function returns an iterator to *one past the last element* of the container. Dereferencing the iterator returned by `end()` is undefined behavior. + +Here, we initialize `it` to the iterator returned by `begin()`. We use the `auto` keyword since writing the type of the iterator can be quite verbose. We then loop through the elements of the vector, dereferencing the iterator to get the element. + +If all we need is to iterate through the elements in order, we can use a range-based for loop. This is a more concise way to iterate through the elements of a container. +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +for (int i : vec) { + std::cout << i << " "; +} +for (int& i : vec) { // Same as above, but we use a reference instead of a copy + i++; +} +``` + +A range-based for loop uses iterators under the hood and automatically dereferences the iterator for you. Theoretically, you can write your own container class and use a range-based for loop with it, as long as you provide the `begin()` and `end()` functions. + +There are other functions that make use of iterators. + +- `insert()` allows you to insert elements at a specific position in the container. + - For vectors, the elements after the position need to be shifted over, so the time complexity of this operation is linear in container size. +- `erase()` allows you to remove elements at a specific position in the container. + - Similar to `insert()`, the elements after the position need to be shifted over. + +```cpp +std::vector<int> vec = {1, 2, 3, 4, 5}; +auto it = vec.begin() + 2; +// insert() may invalidate existing iterators, so keep the one it returns +it = vec.insert(it, 10); // {1, 2, 10, 3, 4, 5} +vec.erase(it); // {1, 2, 3, 4, 5} +``` + +There are also functions defined in the `<algorithm>` header that make use of iterators. +- `std::sort()` sorts the elements of a container. + - The time complexity of this operation is `O(n log n)`, where `n` is the size of the container. 
+ - This function's implementation usually combines multiple sorting algorithms to be as efficient as possible. +- `std::find()` searches for an element in a container. + - The time complexity of this operation is linear in the size of the container. + +```cpp +std::vector<int> vec = {5, 3, 1, 4, 2}; +std::sort(vec.begin(), vec.end()); // {1, 2, 3, 4, 5} +std::find(vec.begin(), vec.end(), 3); // Returns an iterator to the element 3 +``` + +And that's about everything you need to know about vectors and iterators. We'll continue by covering the other containers and how they are different from vectors. + +# Linked Lists + +A linked list is a data structure that consists of nodes. Each node contains a value and a pointer to the next node in the list. The last node in the list points to `nullptr`. + +C++ provides two classes that implement linked lists: `std::forward_list` and `std::list`. They are singly and doubly linked lists, respectively, and are defined in the `<forward_list>` and `<list>` headers. + +`std::list` is similar to an `std::vector`, but with the following differences: + +- `insert()` and `erase()` have constant time complexity. + - This is because the pointers in the nodes are updated, rather than shifting elements over. + - However, you still need the iterator to the element you want to insert or erase, and arriving at that element can still take linear time. +- `list` provides some additional functions for adding elements: + - `push_front()` and `pop_front()` add and remove elements from the front of the list. + - These functions have constant time complexity. +- `list` only provides a bidirectional iterator, so you can't use the `+`, `-`, `+=`, and `-=` operators to move to any element in the list. +- `list` does not have `operator[]` or `at()` to access elements by index. 
+ +```cpp +std::list<int> list = {1, 2, 3, 4, 5}; +auto it = list.begin(); +it = list.insert(it, 10); // {10, 1, 2, 3, 4, 5}; it now points to 10 +list.erase(it); // {1, 2, 3, 4, 5} +list.push_front(10); // {10, 1, 2, 3, 4, 5} +list.pop_front(); // {1, 2, 3, 4, 5} +``` + +`std::forward_list` is similar to `std::list`, but with the following differences: + +- `forward_list` is a singly linked list, so it only has a forward iterator. + - It still has an `end()` function, but since it is a forward iterator, you can't use it to iterate backwards. +- `forward_list` does not have a `size()` function (crazy, right?). +- `forward_list` does not have a `push_back()`, `pop_back()`, or `back()` function. + +`std::list` and `std::forward_list` are not very commonly used, but they still offer a few advantages over vectors in certain situations. For one, they make splitting and merging lists easier. They also have constant time complexity for inserting and erasing elements at any position. + +# Stacks, Queues, and Double-Ended Queues + +A **stack** is a data structure that allows you to add and remove elements from the top. This is known as a **last-in, first-out (LIFO)** data structure. + +C++ provides the `std::stack` class that implements a stack and is defined in the `<stack>` header. You'll find that it actually provides very few functions. Mainly just these: +- `push()` adds an element to the top of the stack. +- `pop()` removes the element at the top of the stack. +- `top()` returns a reference to the element at the top of the stack. +- `empty()` returns `true` if the stack is empty and `false` otherwise. +- `size()` returns the number of elements in the stack. + +Note that some other programming languages use `pop()` to also return the element at the top of the stack. C++ does not do this. You need to use `top()` to get the element at the top of the stack. 
+ +```cpp +std::stack<int> stack; +stack.push(1); +stack.push(2); +stack.push(3); +stack.top(); // 3 +stack.pop(); +stack.top(); // 2 +``` + +Stacks do not provide iterators, so you can't iterate through the elements of a stack. You can only access the top element. + +An `std::stack` is an example of a wrapper container. It uses another container to operate. The container is required to implement the `push_back()`, `pop_back()`, and `back()` functions. By default, `std::stack` uses an `std::deque` as the underlying container, but an `std::vector` or `std::list` can also be used. + +So if you were thinking you could use an `std::vector` as a stack, you'd be totally right. The main difference is that `std::stack` provides a more limited interface than `std::vector`. This can be a good thing, though, as it indicates to the programmer that the container should only be used as a stack. + +A **queue** is a data structure that allows you to add elements to the back and remove elements from the front. This is known as a **first-in, first-out (FIFO)** data structure. + +C++ provides the `std::queue` class that implements a queue and is defined in the `<queue>` header. + +It is similar to a stack in that it provides very few functions. Mainly just these: +- `push()` adds an element to the back of the queue. +- `pop()` removes the element at the front of the queue. +- `front()` returns a reference to the element at the front of the queue. +- `back()` returns a reference to the element at the back of the queue. +- `empty()` returns `true` if the queue is empty and `false` otherwise. +- `size()` returns the number of elements in the queue. + +```cpp +std::queue<int> queue; +queue.push(1); +queue.push(2); +queue.push(3); +queue.front(); // 1 +queue.back(); // 3 +queue.pop(); +queue.front(); // 2 +``` + +Like `std::stack`, `std::queue` is not iterable and is a wrapper for another container. 
The container needs to implement the `push_back()`, `pop_front()`, `front()`, and `back()` functions, so an `std::deque` is suitable. + +A **double-ended queue** (deque) is a data structure that allows you to add and remove elements from both the front and back. It effectively combines the functionality of a stack and a queue. + +Double-ended queues are typically implemented as a circular array (an array with two pointers to keep track of the front and back) or a doubly linked list. + +C++ provides the `std::deque` class that implements a double-ended queue and is defined in the `<deque>` header. It is similar to a vector and provides many of the same functions with the addition of: +- `push_front()` adds an element to the front of the deque. +- `pop_front()` removes the element at the front of the deque. + +`std::deque` also provides an iterator. If you need an iterable queue, an `std::deque` is a good choice. + +```cpp +std::deque<int> deque; +deque.push_front(1); +deque.push_front(2); +deque.push_back(3); +deque.push_back(4); +deque.front(); // 2 +deque.back(); // 4 +deque.pop_front(); +deque.pop_back(); +deque.front(); // 1 +deque.back(); // 3 +``` + +An `std::deque` can even be initialized like a vector: +```cpp +std::deque<int> deque = {1, 2, 3, 4, 5}; +for (int i : deque) { + std::cout << i << " "; +} +``` + +# Priority Queues + +A **priority queue** is a data structure that allows you to add elements with a priority and remove the element with the highest priority. + +Priority queues are typically implemented as a binary max heap, a data structure that is a complete binary tree where each node is greater than or equal to its children. Most operations on a binary max heap are logarithmic in the number of elements. + +C++ provides the `std::priority_queue` class that implements a priority queue and is defined in the `<queue>` header. Like `std::stack` and `std::queue`, it is a wrapper for another container. 
The container needs to provide random access iterators and implement the `front()`, `push_back()`, and `pop_back()` functions, so an `std::vector` is suitable. + +`std::priority_queue` provides the following functions: +- `push()` adds an element to the priority queue. +- `pop()` removes the element with the highest priority. +- `top()` returns a reference to the element with the highest priority. +- `empty()` returns `true` if the priority queue is empty and `false` otherwise. +- `size()` returns the number of elements in the priority queue. + +```cpp +std::priority_queue<int> pq; +pq.push(3); +pq.push(1); +pq.push(2); +pq.top(); // 3 +pq.pop(); +pq.top(); // 2 +``` + +Here, even though `1` was pushed before `2`, `2` has a higher priority because it is larger. + +By default, `std::priority_queue` is a max heap. But we can customize it by providing more template arguments. The second argument is the container to use, and the third argument is a comparison functor class. +```cpp +std::priority_queue<int, std::vector<int>, std::greater<int>> pq; +pq.push(3); +pq.push(1); +pq.push(2); +pq.top(); // 1 +``` + +By default, `std::priority_queue` uses `std::less` as the comparison functor, which (confusingly) creates a max heap. If you want a min heap, you should use `std::greater`. + +If you're not familiar, a **functor** or **function object** is an instance of a class that overloads the `()` operator. It is used to provide a custom comparison function to the priority queue. + +The implementation of `std::less` is very simple. Usually something like this: +```cpp +template <typename T> +struct less { + bool operator()(const T& lhs, const T& rhs) const { + return lhs < rhs; + } +}; +``` + +You can imagine `std::greater` is similar, but with the comparison operator flipped. + +Because C++ lambda functions are functors under the hood, you can use them in place of a functor class. 
+```cpp
+auto cmp = [](int lhs, int rhs) { return lhs > rhs; };
+std::priority_queue<int, std::vector<int>, decltype(cmp)> pq(cmp);
+```
+
+This also makes it possible to use a priority queue with more complex objects. Suppose we wanted to use 2D points and compare them by their distance from the origin.
+```cpp
+struct ComparePoints {
+    bool operator()(const std::pair<int, int>& lhs, const std::pair<int, int>& rhs) const {
+        return lhs.first * lhs.first + lhs.second * lhs.second < rhs.first * rhs.first + rhs.second * rhs.second;
+    }
+};
+
+std::priority_queue<std::pair<int, int>, std::vector<std::pair<int, int>>, ComparePoints> pq;
+```
+
+In this priority queue, we use the `<` operator to compare the squared distances of the points from the origin. This means we are creating a max heap, and the first element in the priority queue will be the point farthest from the origin.
+
+# Sets
+
+A **set** is a data structure that stores unique elements. It is implemented using either a binary search tree or a hash table.
+
+C++ provides two classes that implement sets: `std::set` and `std::unordered_set`, defined in the `<set>` and `<unordered_set>` headers, respectively.
+They are implemented using a binary search tree (typically a self-balancing one) and a hash table, respectively.
+
+- `std::set`
+    - Elements are stored in sorted order.
+    - Most operations have a time complexity of `O(log n)`.
+    - The keys must be comparable.
+    - If the keys are not normally comparable, you can provide a comparison functor.
+- `std::unordered_set`
+    - Elements are stored in an arbitrary order.
+    - Most operations have an amortized time complexity of `O(1)`.
+    - The keys must be hashable.
+    - If the keys are not normally hashable, you can provide a hash function.
+
+Both sets provide the following functions:
+- `insert()` adds an element to the set.
+- `erase()` removes an element from the set.
+- `find()` searches for an element in the set.
+    - Returns an iterator to the element if found, otherwise returns `end()`.
+    - Note: this is different from `std::find()`; this function uses a more efficient search algorithm.
+- `empty()` returns `true` if the set is empty and `false` otherwise.
+- `size()` returns the number of elements in the set.
+- `count()` returns the number of elements in the set that are equal to a given value.
+    - This function is useful for checking if an element is in the set.
+    - Since sets only store unique elements, the return value will be `0` or `1`.
+
+```cpp
+std::set<int> set;
+set.insert(1);
+set.insert(2);
+set.insert(2); // 2 will not be inserted again
+set.size(); // 2
+set.erase(1);
+set.size(); // 1
+
+set.find(2); // Returns an iterator to 2
+set.find(3); // Returns set.end()
+set.count(2); // 1
+set.count(3); // 0
+```
+
+Both `std::set` and `std::unordered_set` provide iterators. The iterators are bidirectional for `std::set` and forward for `std::unordered_set`.
+
+```cpp
+std::set<int> set = {1, 2, 3, 4, 5};
+for (auto it = set.begin(); it != set.end(); it++) {
+    std::cout << *it << " ";
+}
+```
+
+By default, `std::set` uses `std::less` as the comparison functor, which compares elements using the `<` operator. If you want to use a custom comparison function, you can provide a comparison functor as a template argument. This is useful for custom objects. We'll use the same `ComparePoints` functor from the priority queue example.
+```cpp
+std::set<std::pair<int, int>, ComparePoints> set;
+set.insert({1, 2});
+set.insert({3, 4});
+```
+
+Note that `std::set` treats two elements as duplicates when neither compares less than the other. With this functor, two different points at the same distance from the origin count as the same element.
+
+For `std::unordered_set`, the default hash functor class is `std::hash`. You can provide a custom hash function if the keys are not normally hashable. The rules for creating a hash function are a little complicated, so we won't go too deep into it. They go something like this:
+- The return type is `std::size_t` (an unsigned integer type).
+- The function should be deterministic (return the same value for the same input).
+- The function should be fast to compute.
+- The function should distribute the hash values evenly.
+
+You'll learn more about hash functions in a later lesson.
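
As a rough illustration of the rules above, here is a minimal sketch of a custom hash functor for `std::pair<int, int>` keys. The names `PairHash` and `count_unique_points` are hypothetical, and the mixing constant is just one commonly used recipe, not something this course prescribes.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_set>
#include <utility>

// Hypothetical functor for illustration; it is not part of the standard library.
struct PairHash {
    std::size_t operator()(const std::pair<int, int>& p) const {
        std::size_t h1 = std::hash<int>{}(p.first);
        std::size_t h2 = std::hash<int>{}(p.second);
        // Mix the two hashes so that {1, 2} and {2, 1} are unlikely to collide.
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// Hypothetical helper: inserts a few points and returns how many are stored.
std::size_t count_unique_points() {
    // The hash functor is passed as the second template argument.
    std::unordered_set<std::pair<int, int>, PairHash> points;
    points.insert({1, 2});
    points.insert({3, 4});
    points.insert({1, 2}); // Duplicate; not inserted again
    return points.size();
}
```

Calling `count_unique_points()` returns `2`, since the duplicate point is rejected. Equality of keys still uses `==` on the pairs; the hash only decides which bucket a key lands in.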
+
+There are also `std::multiset` and `std::unordered_multiset` classes that allow duplicate elements.
+They are also defined in the `<set>` and `<unordered_set>` headers, respectively. The `insert()` function will always insert the element, and the `count()` function will return the number of elements equal to a given value.
+
+# Maps
+
+A **map** is a data structure that stores key-value pairs. Similar to sets, maps are implemented using either a binary search tree or a hash table.
+
+C++ provides two classes that implement maps: `std::map` and `std::unordered_map`, defined in the `<map>` and `<unordered_map>` headers, respectively.
+
+- `std::map`
+    - Elements are stored in sorted order by key.
+    - Most operations have a time complexity of `O(log n)`.
+    - The keys must be comparable.
+    - If the keys are not normally comparable, you can provide a comparison functor.
+- `std::unordered_map`
+    - Elements are stored in an arbitrary order.
+    - Most operations have an amortized time complexity of `O(1)`.
+    - The keys must be hashable.
+    - If the keys are not normally hashable, you can provide a hash function.
+
+Though maps can be default constructed, they can also be initialized using a list of pairs. Each pair represents a key-value pair.
+```cpp
+std::map<int, std::string> map = {{1, "one"}, {2, "two"}, {3, "three"}};
+```
+
+Maps do things a bit differently than sets, so we'll go over the functions they provide.
+
+## Inserting elements
+
+A typical way to insert elements into a map is to use the `insert()` function with an initializer list for a pair.
+```cpp
+std::map<int, std::string> map;
+map.insert({1, "one"});
+```
+
+You can also use `emplace()`, which constructs the element in place and returns a pair containing an iterator to the element and a boolean indicating if the element was successfully inserted.
+```cpp
+map.emplace(2, "two");
+```
+
+## Accessing elements
+
+You can access elements in a map using the `[]` operator or the `at()` function.
+
+The `[]` operator will locate the element with the given key and return a reference to the value. If the key does not exist, a new element will be created with the default value for the value type.
+```cpp
+std::map<int, std::string> map = {{1, "one"}, {2, "two"}, {3, "three"}};
+map[1]; // "one"
+map[4]; // "" (a new element with key 4 is inserted)
+```
+
+The `at()` function will locate the element with the given key and return a reference to the value. If the key does not exist, an `std::out_of_range` exception will be thrown.
+```cpp
+map.at(1); // "one"
+map.at(5); // Throws an exception
+```
+
+Be careful about using the `[]` operator to access elements in a map. You might accidentally insert a new element if you aren't careful.
+
+## Iterating over elements
+
+Maps provide iterators that allow you to iterate over the elements. The iterators are bidirectional for `std::map` and forward for `std::unordered_map`.
+However, they are slightly different from the other iterators we've seen. Iterators for maps are iterators to pairs of keys and values.
+
+```cpp
+std::map<int, std::string> map = {{1, "one"}, {2, "two"}, {3, "three"}};
+for (auto it = map.begin(); it != map.end(); it++) {
+    std::cout << it->first << ": " << it->second << " ";
+}
+for (const auto& pair : map) {
+    std::cout << pair.first << ": " << pair.second << " ";
+}
+for (const auto& [key, value] : map) {
+    std::cout << key << ": " << value << " ";
+}
+```
+
+The first loop is the most basic way to iterate over a map. The second loop uses a range-based for loop, which is more concise. The third loop uses structured bindings, which allows you to unpack the pair into separate variables (structured bindings were introduced in C++17).
+
+## Other map functions
+
+The rest of the functions provided by maps are similar to those provided by sets.
+
+- `erase()` removes an element from the map.
+- `find()` searches for an element in the map.
+- `empty()` returns `true` if the map is empty and `false` otherwise.
+- `size()` returns the number of elements in the map.
+
+```cpp
+std::map<int, std::string> map = {{1, "one"}, {2, "two"}, {3, "three"}};
+map.erase(1);
+if (map.find(2) != map.end()) {
+    std::cout << "Found" << std::endl;
+} else {
+    std::cout << "Not found" << std::endl;
+}
+map.empty(); // false
+map.size(); // 2
+```
+
+# Conclusion
+
+And that's it for this lesson on C++ containers! In this lesson, we've only covered how to use these containers. We haven't discussed how they are implemented or the theory behind them. We'll discuss these topics in later lessons.
+
+# References
+
+- [C++ Reference](https://en.cppreference.com/w/cpp)
diff --git a/cpp-review/memory/image.png b/cpp-review/memory/image.png
new file mode 100644
index 0000000..3647114
Binary files /dev/null and b/cpp-review/memory/image.png differ
diff --git a/cpp-review/memory/images-ppt-2.gif b/cpp-review/memory/images-ppt-2.gif
new file mode 100644
index 0000000..acd1420
Binary files /dev/null and b/cpp-review/memory/images-ppt-2.gif differ
diff --git a/cpp-review/memory/images-ppt.gif b/cpp-review/memory/images-ppt.gif
new file mode 100644
index 0000000..31b3a0e
Binary files /dev/null and b/cpp-review/memory/images-ppt.gif differ
diff --git a/cpp-review/memory/images-ppt.pptx b/cpp-review/memory/images-ppt.pptx
new file mode 100644
index 0000000..fc33af8
Binary files /dev/null and b/cpp-review/memory/images-ppt.pptx differ
diff --git a/cpp-review/memory/lesson.md b/cpp-review/memory/lesson.md
new file mode 100644
index 0000000..f280d9e
--- /dev/null
+++ b/cpp-review/memory/lesson.md
@@ -0,0 +1,392 @@
+# C++ Memory and Arrays
+
+**Author:** *Brian Magnuson*
+
+In this lesson, we'll continue where we left off in our C++ review.
+
+We'll cover the following topics:
+- Memory model
+- Pointers
+- Arrays
+- Heap memory management
+- Heap-allocated arrays
+- Smart pointers
+
+# Memory Model
+
+The memory used by a process is often conceptually organized into segments:
+- **Text Memory**: Contains the executable code.
+- **Data Memory**: Contains global and static variables.
+- **Stack Memory**: Contains local variables and function call information.
+- **Heap Memory**: Contains dynamically allocated memory.
+
+![Diagram of the memory used by a process.](image.png)
+
+The C++ standard does not use the words "stack memory" and "heap memory" and instead uses the terms "automatic storage duration" and "dynamic storage duration".
+
+To avoid confusion, we'll use the terms "stack memory" and "heap memory" in this lesson and future lessons. This is also to avoid confusion between the terms "dynamic array" and "dynamically-allocated array", which are two different things.
+
+In practice, the memory used by a process is managed by the operating system. Techniques such as virtual memory and address space layout randomization (ASLR) are used to protect memory and prevent unauthorized access. While the conceptual segments (text, data, stack, heap) are useful for understanding, the actual memory layout can vary and is managed to ensure security and efficiency.
+
+Whenever you create a local variable in a function, it is stored in the stack memory.
+```cpp
+void foo() {
+    int x = 5;
+}
+```
+The memory for the variable is allocated when the function is called and deallocated when the function returns.
+
+Functions in C++ can call themselves, which is known as **recursion**. Each recursive call creates a new stack frame. If the recursion goes too deep, the stack will overflow and the program will crash.
+```cpp
+void foo(int x) {
+    if (x == 0) {
+        return;
+    }
+    std::cout << x << std::endl;
+    foo(x - 1); // Each call adds a new stack frame
+}
+```
+
+Here, if you call this function with a value of `-1`, the base case `x == 0` is skipped, so the function may recurse very deeply and cause a stack overflow.
+
+We'll discuss heap memory later in this lesson.
+
+# Pointers
+
+A **pointer** is a variable whose value is an address in memory. You can create a pointer to any type by using the type followed by an asterisk `*`.
+You can get the address of a variable by using the `&` operator in front of the variable name.
+```cpp
+int x = 5;
+int* p = &x;
+```
+
+Some people also prefer to put the asterisk next to the variable name: `int *p`. This is a matter of personal preference. I prefer to put the asterisk next to the type name.
+
+If you were to print `p`, you would get the address of `x`. This could be any address in memory, so knowing the exact number is usually not useful.
+
+You can dereference a pointer by using the `*` operator in front of the pointer name. This will give you the value at the address stored in the pointer.
+```cpp
+std::cout << *p << std::endl; // 5
+```
+
+The variable `p` does not store the value `5`. Only `x` stores the value of `5`. We can use dereferencing to change the value of `x` through the pointer.
+```cpp
+*p = 10;
+std::cout << x << std::endl; // 10
+```
+
+You might notice that this behavior is similar to using references. Indeed, you can use pointers in a similar fashion. However, pointers allow you to do a few more things such as pointer arithmetic and heap memory allocation. These can be dangerous if not used correctly.
+
+Typically, you should use references unless you have a reason to use pointers.
+
+You can add `const` in front of the pointer type. This does not make the pointer constant, but rather prevents users from modifying the value through the pointer.
+```cpp
+int y = 5;
+const int* q = &x;
+*q = 10; // Not allowed
+q = &y; // Allowed
+```
+
+You can also use `const` after the asterisk to make the pointer itself constant.
+```cpp
+int* const r = &x;
+r = &y; // Not allowed
+*r = 10; // Allowed
+```
+
+And you can also use both. But it's a little overkill.
+```cpp
+const int* const s = &x;
+```
+
+You can also create a pointer to a pointer. This is sometimes useful when you want to modify a pointer in a function.
+```cpp
+int** pp = &p;
+```
+
+You can also create a `void*` pointer.
+This is arguably the most dangerous pointer type because it can point to any type. You can cast a `void*` pointer to any other pointer type.
+```cpp
+void* vp = &x;
+int* ip = static_cast<int*>(vp);
+```
+
+You should avoid using `void*` pointers whenever possible. People may use them to get around type checking, but this can easily lead to bugs and there are many safer alternatives such as `std::any` and `std::variant` (both introduced in C++17).
+
+Pointers are usually initialized to the address of a variable. However, you can also assign them to `nullptr` to indicate that they are not pointing to anything.
+```cpp
+int* p = nullptr;
+```
+
+`nullptr` and `NULL` are sometimes used interchangeably, but they are not the same. `NULL` is a macro that typically expands to an integer constant such as `0`, while `nullptr` has its own type (`std::nullptr_t`) and only converts to pointer types. Whenever you have a pointer, but no memory address to point to, you should always use `nullptr`.
+
+Finally, on the topic of dereferencing, you should never dereference a pointer to memory that you don't own. This includes `nullptr`. Attempting to do so leads to undefined behavior. In many cases though, your operating system will give the process a `SIGSEGV` signal and terminate the program. This is also known as a **segmentation fault**. A segmentation fault occurs whenever you try to access memory that you don't own.
+
+# Arrays
+
+An **array** is a contiguous block of memory that stores elements of the same type. You can create an array using any of the following syntaxes:
+```cpp
+int arr1[5]; // Array of 5 integers (uninitialized)
+int arr2[] = {1, 2, 3, 4, 5}; // Array of 5 integers
+int arr3[5] = {1, 2, 3, 4, 5}; // Array of 5 integers
+```
+
+Arrays require a size that can be determined at compile time (though some compilers allow variable-length arrays as an extension). The second line above infers the size of the array from the number of elements in the initializer list.
+
+You can access elements of an array using the subscript operator `[]`. The index starts at 0.
+```cpp
+std::cout << arr2[0] << std::endl; // 1
+```
+
+There are a few ways to get the size of an array. The first way is to use the `sizeof` operator. This actually gives you the size of the array in bytes, so you need to divide by the size of the type to get the number of elements.
+```cpp
+sizeof(arr2) / sizeof(arr2[0]); // Typically 20 / 4 = 5
+```
+
+***This only works for stack-allocated arrays with a size known at compile time.***
+
+C++'s standard library provides an `std::array` wrapper class which provides a size member function.
+```cpp
+std::array<int, 5> arr4 = {1, 2, 3, 4, 5};
+std::cout << arr4.size() << std::endl; // 5
+```
+
+Aside from this, the only reliable way to get the size of an array is to keep track of it yourself. And even this can be error-prone. Because of this, we recommend using `std::vector` whenever possible.
+
+The `std::vector` class is a dynamic array that can grow and shrink in size. It is part of the C++ standard library and is much safer than using raw arrays.
+```cpp
+std::vector<int> vec = {1, 2, 3, 4, 5};
+vec.push_back(6); // Can't do this with raw arrays!
+std::cout << vec.size() << std::endl; // 6
+vec.pop_back();
+std::cout << vec.size() << std::endl; // 5
+```
+
+In most expressions, an array-type variable is implicitly converted to a pointer to the first element of the array (this is often called "decaying"). This is why you can pass arrays around despite the fact that arrays can be of any size, and also why the size information is lost when you do.
+
+If you perform arithmetic on an array, you can access elements relative to the first element.
+```cpp
+int arr[] = {1, 2, 3, 4, 5};
+std::cout << *(arr + 1) << std::endl; // 2
+int* ptr = arr;
+std::cout << *(ptr + 1) << std::endl; // 2
+```
+
+Whenever you add to an array or pointer, you are actually adding increments of the size of the type. Here, we add 1 to the pointer, but we are actually moving over `1 * sizeof(int)` bytes, which is typically 4 bytes.
+
+You can also use this method to access elements *before* the first element or *after* the last element.
+This results in undefined behavior. It is one of the most common ways to misuse pointers and can lead to security vulnerabilities or a segmentation fault.
+```cpp
+int var = 6;
+int arr[] = {1, 2, 3, 4, 5};
+std::cout << *(arr - 1) << std::endl; // Undefined behavior (but it might also print 6)
+```
+
+It goes without saying, but ***you should never access memory that you don't own***.
+
+# Memory Management
+
+Sometimes, you will need to allocate memory on the heap. This is useful for a few reasons:
+- The stack is thought of as being limited in size while the heap is much larger, so it is suitable for allocating large amounts of memory.
+- You can allocate memory on the heap and keep it around after the function returns.
+- You may need to allocate an amount of memory that is not known at compile time (a requirement for pretty much any data structure).
+
+To allocate a single value on the heap, you can use the `new` operator along with the type and any arguments needed for the constructor.
+```cpp
+int* p = new int(5);
+MyObject* obj = new MyObject();
+```
+
+Note that `p` is a pointer. We've already used pointers to store addresses to stack memory. Here, we're storing the address of heap memory. You can't easily tell the difference between stack memory and heap memory just by looking at the pointer.
+
+Additionally, `p` is still a stack-allocated variable. The memory for `5` is allocated on the heap, not the pointer. This means that `p` can fall out of scope, just like any other variable. However, the memory for `5` *will not* be deallocated. So as long as we keep the value of `p` around, we can access the memory for `5`.
+
+If we, perhaps by accident, lose the pointer to the memory, then `5` is no longer accessible. The memory will be deallocated eventually when the process terminates, but if we keep allocating memory without freeing it, we can run out of memory. This is known as a **memory leak**.
+Programs that leak memory for long periods of time can consume all available memory and eventually crash.
+
+To deallocate memory, you can use the `delete` operator. This will deallocate the memory and call the destructor if the type has one.
+```cpp
+delete p;
+delete obj;
+```
+C++ does not provide garbage collection like Java or Python. Whenever you allocate memory with `new`, you must deallocate it with `delete`. Perhaps the worst part about this is that you cannot deallocate memory more than once. This is known as a **double free** and is also an error. So you can't just delete every pointer whenever you feel like it.
+
+Calling `delete` on a pointer to a class/struct object will also call the destructor for that object before deallocating the memory. If you set up your destructors wisely, calling `delete` on a pointer to an object can set off a chain of deletes that cleans up all the memory associated with that object.
+
+Another important point is that `delete` does not remove the pointer from scope. Usually this isn't a problem since you might not use the pointer again and the pointer will fall out of scope. However, this can be a problem if you have many pointers to the same object and you run `delete` on one of them. The other pointers will still point to the memory, but the memory is no longer valid. This is known as a **dangling pointer**. If you have a pointer to deallocated memory, you should set the pointer to `nullptr`.
+```cpp
+p = nullptr;
+obj = nullptr;
+```
+
+Attempting to dereference a null pointer is still an error, but at least you can check if the pointer is null before dereferencing it.
+
+# Heap-Allocated Arrays
+
+We'll finish this lesson by discussing heap-allocated arrays. This is similar to allocating a single value on the heap, but uses a slightly different syntax.
+```cpp
+int* arr = new int[5];
+int* arr2 = new int[5] {1, 2, 3, 4, 5};
+MyObject* objs = new MyObject[5];
+```
+
+Note: the third line only works for objects that are default-constructible. If the object is not default-constructible, you should either define a default constructor, use an initializer list, or use a container such as `std::vector` instead.
+
+You can access elements of the array using the subscript operator `[]` just like with stack-allocated arrays.
+```cpp
+std::cout << arr2[0] << std::endl; // 1
+```
+
+Just like with stack-allocated arrays, you should keep track of the size of the array yourself. The `sizeof` operator will not work for heap-allocated arrays. You should not try to access memory outside the bounds of the array.
+
+To deallocate a heap-allocated array, you should use the `delete[]` operator.
+```cpp
+delete[] arr;
+delete[] arr2;
+delete[] objs;
+```
+
+Do not use `delete[]` with an object allocated with `new` or use `delete` with an array allocated with `new[]`. This leads to undefined behavior.
+
+If you are creating your own data structure and you intend to use heap-allocated arrays, you can use `delete[]` in the destructor of the class. This will ensure that the memory is deallocated when the object is destroyed.
+
+Most STL containers such as `std::vector` allocate memory on the heap. You don't need to use `new[]` or `delete[]` with these containers. The containers will handle memory management for you. For this reason, you should prefer to use STL containers whenever possible (unless we tell you otherwise for, say, a programming problem).
+
+# Practice
+
+Consider the following code:
+```cpp
+int i = 7;
+int* p = &i;
+```
+Which of the following does NOT evaluate to 7?
+- `i`
+- `*p`
+- `p`
+- `*(&i)`
+- (all of these evaluate to 7)
+
+
+Answer
+`p`
+
+
+---
+
+What is the output of the following code?
+```cpp
+#include <iostream>
+int main() {
+    int arr[] = {1, 2, 3, 4, 5};
+    for (int i = 1; i < 4; i++) {
+        std::cout << arr[i];
+    }
+    return 0;
+}
+```
+- 12345
+- 1234
+- 234
+- 2345
+- (results in undefined behavior/segmentation fault)
+
+
+Answer
+234
+
+
+---
+
+What is the output of the following code?
+```cpp
+#include <iostream>
+int main() {
+    int* p = new int(5);
+    std::cout << *(p + 1) << std::endl;
+    return 0;
+}
+```
+- 0
+- 5
+- 6
+- (results in undefined behavior/segmentation fault)
+
+
+Answer
+(results in undefined behavior/segmentation fault)
+
+
+---
+
+What is the error in the following code?
+```cpp
+#include <iostream>
+int main() {
+    int* p = new int[5] {1, 2, 3, 4, 5};
+    std::cout << p[3] << std::endl;
+    delete p;
+    return 0;
+}
+```
+- `p` should have the type `int[]`, not `int*`.
+- The syntax after the `new` operator is incorrect for an array.
+- The access to `p[3]` is out of bounds.
+- The `delete` operator should be `delete[]`
+
+
+Answer
+The `delete` operator should be `delete[]`
+
+
+# Smart Pointers
+
+This next part is *completely optional*. However, since we're on the topic of memory management and best practices, I felt it was worth discussing smart pointers.
+
+Smart pointers are wrappers around raw pointers that provide automatic memory management. They are defined in the `<memory>` header and are part of the C++11 standard.
+
+C++ offers two main types of smart pointers: `std::shared_ptr` and `std::unique_ptr`.
+
+An `std::shared_ptr` can be initialized using the pointer's constructor or using the `std::make_shared` function (which accepts the arguments needed to construct the allocated object).
+```cpp
+std::shared_ptr<int> p1(new int(5));
+std::shared_ptr<int> p2 = std::make_shared<int>(5);
+```
+
+Shared pointers use a memory management strategy called **reference counting**.
+They keep track of how many shared pointers are pointing to a particular object. When the last shared pointer pointing to an object is destroyed, the object is deallocated.
+
+![Animation of shared pointers being copied, then deleted.](images-ppt.gif)
+
+One way to think about this is to imagine a group of people watching TV together. The first person to enter the room turns on the TV. Each person that enters the room has access to the TV remote. Then, when the last person leaves the room, they turn off the TV.
+
+Once initialized, smart pointers can be used almost like raw pointers. You can dereference them with `*` and `->`, and you can use `get()` to obtain the raw pointer for functions that take pointers.
+```cpp
+auto p = std::make_shared<std::pair<int, int>>(1, 2);
+std::cout << p->first << std::endl;
+```
+
+People who first learn about shared pointers might be tempted to use them for everything to avoid leaking memory. However, ***it is still possible to create memory leaks with shared pointers***. If you have a cycle of shared pointers, the reference count will never reach zero and the memory will never be deallocated.
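
To make the reference counts concrete, here is a minimal sketch of two heap objects that own each other. The names `Node`, `other`, and `owners_in_cycle` are hypothetical and chosen only for illustration. The sketch breaks the cycle by hand before returning so that the example itself does not leak.

```cpp
#include <memory>

// Hypothetical node type: each node may own another node through a shared pointer.
struct Node {
    std::shared_ptr<Node> other;
};

// Builds a two-node cycle and returns how many shared pointers own the first node.
long owners_in_cycle() {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->other = b;
    b->other = a; // Cycle: a owns b, and b owns a

    // Two owners of the first node: the local variable `a` and `b->other`.
    // If `a` simply went out of scope here, the count would drop to 1, not 0.
    long owners = a.use_count();

    a->other.reset(); // Break the cycle so both nodes are freed on return
    return owners;
}
```

Here `owners_in_cycle()` returns `2`. Without the `reset()` call, both nodes would still have one owner each after the function returns, so neither would ever be deallocated.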
+
+![Animation of heap objects referencing each other with shared pointers.](images-ppt-2.gif)
+
+One way to think about this is to imagine two people watching TV, and each person will only leave the room if the other person leaves first. In this setup, neither person will leave the room first, so the TV will never be turned off.
+
+This can easily happen with cyclic graphs, doubly-linked lists, or any other data structure that has pointers pointing to each other.
+
+To get around this, C++ offers `std::weak_ptr`, a pointer that does not increment the reference count. You can read more about weak pointers [here](https://en.cppreference.com/w/cpp/memory/weak_ptr).
+
+An `std::unique_ptr` is similar to an `std::shared_ptr`, but with one crucial difference: *it cannot be copied*. This also means that there is no need for reference counting since there can only ever be one `std::unique_ptr` pointing to an object. When the `std::unique_ptr` is destroyed, the object is deallocated.
+```cpp
+std::unique_ptr<int> p1(new int(5));
+std::unique_ptr<int> p2 = std::make_unique<int>(5); // std::make_unique requires C++14
+```
+
+Since `std::unique_ptr` cannot be copied, you cannot simply pass it to a function that takes an `std::unique_ptr` by value. The only way to keep the pointer around is to pass it by reference, or to use `std::move` to transfer ownership of the pointer. You can also release the pointer from the `std::unique_ptr`, though this negates the purpose of using a smart pointer in the first place.
+
+Rust developers may find these concepts familiar, as Rust uses a similar system of ownership and borrowing for its pointers.
+
+For this course, you will not be expected to use smart pointers, and the projects that require memory management are fairly easy to manage manually. However, it's good to know about smart pointers for when you work on larger projects.
+
+# Conclusion
+
+That's all for this lesson. We've covered the memory model, pointers, arrays, and heap memory management.
+In the next lesson, we'll discuss classes and object-oriented programming.
+
+# References
+
+- [C++ Reference](https://en.cppreference.com/w/cpp)
+- [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html)