diff --git a/pinecone/chunks.jsonl b/pinecone/chunks.jsonl
index a37521d..91ec687 100644
--- a/pinecone/chunks.jsonl
+++ b/pinecone/chunks.jsonl
@@ -8,3 +8,1567 @@
{"id": "../pages/digitalGarden/index.mdx#8", "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "Header 3": "PlantUML", "path": "../pages/digitalGarden/index.mdx", "id": "../pages/digitalGarden/index.mdx#8", "page_content": "If you ever need to create diagrams, and especially UML diagrams, PlantUML is the way to go. I started with Mermaid\nto create UML diagrams but swapped to PlantUML for the additional features and the ability to create custom themes\n(so everything can be minimalist and purple :D). \nTo render PlantUML diagrams the [Remark plugin Simple PlantUML](https://github.com/akebifiky/remark-simple-plantuml) is\nused, which uses the official PlantUML server to generate an image and then adds it. \nAn example can be seen below, on the [official website](https://plantuml.com/) and also on [REAL WORLD PlantUML](https://real-world-plantuml.com/?type=class). \n```plantuml\n@startuml\n\ninterface Command {\nexecute()\nundo()\n}\nclass Invoker{\nsetCommand()\n}\nclass Client\nclass Receiver{\naction()\n}\nclass ConcreteCommand{\nexecute()\nundo()\n}\n\nCommand <|-down- ConcreteCommand\nClient -right-> Receiver\nClient --> ConcreteCommand\nInvoker o-right-> Command\nReceiver <-left- ConcreteCommand\n\n@enduml\n``` \nTo use my custom theme you can add the following line at the beginning of the PlantUML file: \n```\n@startuml\n!theme purplerain from http://raw.githubusercontent.com/LuciferUchiha/georgerowlands.ch/main\n\n...\n\n@enduml\n``` \nHowever, it seems that when using a custom theme there can not be more than one diagram per page? My custom theme also has some built-in procedures for simple text coloring, for example in cases of success, failure etc. \n```plantuml\n@startuml\n!theme purplerain from http://raw.githubusercontent.com/LuciferUchiha/georgerowlands.ch/main\n\nBob -> Alice : normal\nBob <- Alice : $success(\"success: Hi Bob\")\nBob -x Alice : $failure(\"failure\")\nBob ->> Alice : $warning(\"warning\")\nBob ->> Alice : $info(\"finished\")\n\n@enduml\n```"}}
{"id": "../pages/digitalGarden/index.mdx#9", "metadata": {"Header 1": "My Digital Garden", "Header 2": "How can I Contribute?", "path": "../pages/digitalGarden/index.mdx", "id": "../pages/digitalGarden/index.mdx#9", "page_content": "Do you enjoy the content and want to contribute to the garden by adding some new plants or watering the existing ones?\nThen feel free to make a pull request. There are however some rules to keep in mind before adding or changing content. \n- Markdown filenames and folders are written in camelCase.\n- Titles should follow the\n[IEEE Editorial Style Manual](https://www.ieee.org/content/dam/ieee-org/ieee/web/org/conferences/style_references_manual.pdf).\nThey should also be added to the markdown file and specified in the `_meta.json` which maps files to titles and is also\nresponsible for the ordering.\n- LaTeX should conform with my notation and guideline, if something is not defined there you can of course add it."}}
{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/analysisOfAlgorithms.mdx#1", "metadata": {"Header 1": "Analysis of Algorithms", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/analysisOfAlgorithms.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/analysisOfAlgorithms.mdx#1", "page_content": "Asymptotic Complexity / Analysis of Algorithms \nThe master method and how to calculate it and stuff, go back to algd1, MIT 6.006 and Algorithms Illuminated will help. \nTelescoping? How to get to the recurrence relation and then the asymptotic complexity."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#1", "metadata": {"Header 1": "Bags", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#1", "page_content": "A bag is a data structure that can contain the same element multiple times, which is why it is often also called a multiset. The order of the elements is not necessarily preserved; this depends on the implementation. Common operations on a bag are adding elements, removing elements and searching for a specific element."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#2", "metadata": {"Header 1": "Bags", "Header 2": "Implementing a Bag", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#2", "page_content": "One of the simplest ways of implementing data structures is by using arrays. When implementing a data structure the time complexities can differ depending on whether the data is always kept in a sorted state or not. \n\n```java filename=\"UnsortedBag.java\"\n// TODO\n```\n \nWhen implementing a sorted collection in Java you can either implement your own binary search or you can use `java.util.Arrays.binarySearch(a, from, to, key)`, which returns the index of the key if it is contained and otherwise $-(\\text{insertion point}) - 1$, with the insertion point being the point where the key would be inserted, i.e. the index of the first element greater than the key. \n\n```java filename=\"SortedBag.java\"\n// TODO\n```\n"}}
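The insertion-point trick of `Arrays.binarySearch` described above might be used in a sorted bag roughly like this. A minimal sketch assuming int elements and a fixed capacity for brevity; the class and method names are illustrative, not from the page:

```java
import java.util.Arrays;

public class SortedBagSketch {
    private int[] arr = new int[16];
    private int size = 0;

    public void add(int e) {
        // binarySearch returns -(insertion point) - 1 if e is not present
        int i = Arrays.binarySearch(arr, 0, size, e);
        if (i < 0) i = -(i + 1); // recover the insertion point
        // shift right to make room, then insert: O(log n) + O(n)
        System.arraycopy(arr, i, arr, i + 1, size - i);
        arr[i] = e;
        size++;
    }

    public boolean contains(int e) {
        // binary search on the sorted prefix: O(log n)
        return Arrays.binarySearch(arr, 0, size, e) >= 0;
    }

    public static void main(String[] args) {
        SortedBagSketch bag = new SortedBagSketch();
        bag.add(5); bag.add(1); bag.add(3); bag.add(3); // duplicates allowed
        System.out.println(bag.contains(3)); // true
        System.out.println(bag.contains(4)); // false
    }
}
```

Because duplicates are allowed, `add` simply treats a found key's index as the insertion point as well.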
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#3", "metadata": {"Header 1": "Bags", "Header 2": "Implementing a Bag", "Header 3": "Time Complexities", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#3", "page_content": "| Operation | UnsortedBag | SortedBag |\n| ---------------- | ------------------------------------------ | ----------------------------------------------------- |\n| add(E e) | $O(1)$ <br/> no search or shift | $O(n)$ <br/> search + shift right $O(\\log{n}) + O(n)$ |\n| search(Object o) | $O(n)$ <br/> linear search | $O(\\log{n})$ <br/> binary search |\n| remove(Object o) | $O(n)$ <br/> search + remove $O(n) + O(1)$ | $O(n)$ <br/> search + shift left $O(\\log{n}) + O(n)$ |\n| Ideal use case | When adding a lot | When searching a lot |"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#4", "metadata": {"Header 1": "Bags", "Header 2": "Bag of Words", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/bags.mdx#4", "page_content": " \nWhat is a bag of words? How is it used in NLP?\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/collections.mdx#1", "metadata": {"Header 1": "Collections", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/collections.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/collections.mdx#1", "page_content": "Collections are containers/data structures that can hold elements of the same type. Most programming languages have some basic implementations as part of their standard library. Depending on the problem to be solved, certain data structures are better options than others. In Java, the `java.util` package contains some of the most common collections, along with the `java.util.Collections` utility class. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#1", "metadata": {"Header 1": "Hash Tables", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#1", "page_content": "In an ideal world, we would want to be able to access data in $O(1)$ using the data's unique identifier (key). \n \nFor this to work we need to be able to generate a hash code from the key. From this hash code (a number) we then want to get an index into a hash table by using a hash function. For this approach to work, two conditions must be met. Firstly, we need to be able to tell if two objects are the same (the `equals` function); secondly, we need to be able to generate a hash code from the unique identifier, which can consist of a combination of attributes or just one. \nImportantly the following must be true: \n$$\n(a.equals(b)) \\Rightarrow (a.hashCode() == b.hashCode())\n$$ \nSo if two objects are the same then their hash code must be the same as well. However, if two hash codes are the same it does not necessarily mean that the objects are the same; this is a so-called collision."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#2", "metadata": {"Header 1": "Hash Tables", "Header 2": "Hashing Function", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#2", "page_content": "We want to be able to calculate the index as fast as possible. From the above requirements, we also want the same keys to produce the same indices. We also want the hash codes and therefore the indices to be evenly distributed to minimize collisions. \nFor starters we could use the following hashing function: \n$$\nindex = hash\\,code \\mod table.length()\n$$"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#3", "metadata": {"Header 1": "Hash Tables", "Header 2": "Hash Code", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#3", "page_content": "We want the generated hash code to be random and, if possible, evenly spread across the entire range of possible numbers. \nIf the unique identifier is a data type that fits into 32 bits, like boolean, byte, short, int, char or float, we can just take its value straight as an int. \nIf the unique identifier is a 64-bit data type, like long or double, we can use an exclusive or (XOR, only true if the bits differ) between the two 32-bit halves. \n```java\npublic int hashCode() {\n    // XOR of the two 32-bit halves\n    return (int) (value ^ (value >>> 32));\n}\n``` \nFor strings, it gets a bit harder. You might think it would be a good idea to add the characters, represented as integers, together. However, this is a very bad idea because, for example, AUS and USA would then have the same hash code. Instead, we create a polynomial using the character values as coefficients. \n```java\npublic final class String {\n    private final char value[];\n    /** Cache the hash code for the string, to avoid recalculation */\n    private int hash; // Defaults to 0\n    ...\n    public int hashCode() {\n        int h = hash;\n        if (h == 0 && value.length > 0) {\n            char val[] = value;\n            for (int i = 0; i < value.length; i++) {\n                h = 31 * h + val[i];\n            }\n            hash = h;\n        }\n        return h;\n    }\n    ...\n}\n```"}}
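The AUS/USA claim above can be checked directly. A small sketch comparing a naive "sum of characters" hash (the bad example) against Java's built-in `String.hashCode()`, which uses exactly the `h = 31 * h + c` polynomial shown:

```java
public class StringHashDemo {
    // naive hash: just sum the character values (the bad idea from the text)
    static int sumHash(String s) {
        int h = 0;
        for (char c : s.toCharArray()) h += c;
        return h;
    }

    public static void main(String[] args) {
        // the anagrams collide under the sum hash...
        System.out.println(sumHash("AUS") == sumHash("USA")); // true
        // ...but not under the polynomial hash
        System.out.println("AUS".hashCode() == "USA".hashCode()); // false
    }
}
```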
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#4", "metadata": {"Header 1": "Hash Tables", "Header 2": "HashMap", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#4", "page_content": "In Java, a HashMap always has a size equal to a power of 2. This leads to the map reserving in the worst case twice as much memory as it needs. However, the advantage of this implementation is that it is very easy to calculate powers of 2 with bit shifts. It also allows us to change the hash function `(hashCode() & 0x7FFFFFFF) % length` to `hashCode() & (length -1)`. The bitmask with `0x7FFFFFFF` ensures that the hash code is positive. \n```java\npublic HashMap(int initialCapacity) {\nint capacity = 1;\nwhile (capacity < initialCapacity)\ncapacity <<= 1;\ntable = new Entry[capacity];\n}\n\nprivate int indexFor(int h) {\nreturn h & (table.length - 1);\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#5", "metadata": {"Header 1": "Hash Tables", "Header 2": "Collision Resolution", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#5", "page_content": "As mentioned before collisions are when different objects have the same hash code and therefore the same index. This can happen and can't be avoided. This is why they need to be handled."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#6", "metadata": {"Header 1": "Hash Tables", "Header 2": "Collision Resolution", "Header 3": "Separate Chaining", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#6", "page_content": "With this strategy when there is a collision, the colliding elements are chained together just like in a linked list. The advantage of this strategy is that it is very simple and the table never becomes full. The problem however is that it needs additional memory and the memory needs to be dynamic. \n \nThe class for a HashMap would then look something like this: \n```java\npublic class HashMap<K, V> implements Map<K, V> {\n    Node[] table;\n    ...\n    static class Node implements Map.Entry {\n        final K key;\n        V value;\n        Node next;\n        ...\n    }\n}\n``` \nIf the table has the size $m$ and we insert $n$ elements we can calculate the probability of there being no collision using the following formula: \n$$\n\\prod_{i=0}^{n-1}{\\frac{m-i}{m}}\n$$ \nFrom this we can then also calculate the probability of there being at least 1 collision: \n$$\n1 - \\prod_{i=0}^{n-1}{\\frac{m-i}{m}}\n$$"}}
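The two formulas above can be evaluated numerically. With $m = 365$ and $n = 23$ (assumed values, chosen because they turn this into the classic birthday problem) the chance of at least one collision already exceeds one half:

```java
public class CollisionProbability {
    // product of (m - i) / m for i = 0 .. n-1: probability of no collision
    static double noCollision(int m, int n) {
        double p = 1.0;
        for (int i = 0; i < n; i++) p *= (double) (m - i) / m;
        return p;
    }

    public static void main(String[] args) {
        double atLeastOne = 1 - noCollision(365, 23);
        // with only 23 elements in 365 slots a collision is more likely than not
        System.out.printf("%.3f%n", atLeastOne);
    }
}
```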
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#7", "metadata": {"Header 1": "Hash Tables", "Header 2": "Collision Resolution", "Header 3": "Open Addressing", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#7", "page_content": "With this strategy when there is a collision, we look for a free space in the hash table. The advantage of this strategy is that it does not need any additional space, however the table can become full. The process of finding a free space is called probing. \n#### Linear Probing \nWhen using linear probing we try the next highest index until we find a free space. If we reach the end of the table we continue the search at index 0 until we are back at the initial area of the collision, which means the table is full. \nSo if the hash code is $x$ and the table has the size $m$ the index after $k$ collisions is: \n$$\nindex = (x \\mod m + k) \\mod m\n$$ \n```java\npublic void add(T elem) {\n    int i = (elem.hashCode() & 0x7FFFFFFF) % size;\n    while (array[i] != null)\n        i = (i + 1) % size;\n    array[i] = elem;\n}\n``` \nThe above code however doesn't check whether the hash table should only hold unique values (set semantics) or whether the table is already full. Additionally, with this strategy clusters of values can form. When adding a value into a cluster you make the cluster even bigger and therefore also increase the probability of hitting the cluster. \nWhen inserting into a table of size $n$ with a cluster of size $k$ we can calculate the probability of hitting the cluster and therefore also increasing its size: \n$$\n\\frac{k+2}{n}\n$$ \nWe can also calculate the probability of needing at least 3 probe steps when adding, which is: \n$$\n\\frac{k-2}{n}\n$$ \n##### Double Hashing \nThe idea here is that we don't look at the next highest free space, which is equivalent to a step size of 1, but rather each element calculates a step size for itself. This is done to avoid creating clusters. This strategy is called double hashing as you have a hash function for the index and one for the step size. \nSo if the hash code is $x$ and the table has the size $m$ the index after $k$ collisions is: \n$$\nindex = (x \\mod m + k \\times step) \\mod m\n$$ \n```java\npublic void add(T elem) {\n    int i = (elem.hashCode() & 0x7FFFFFFF) % size;\n    int step = ...?\n    while (array[i] != null) {"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#8", "metadata": {"Header 1": "Hash Tables", "Header 2": "Collision Resolution", "Header 3": "Open Addressing", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#8", "page_content": "$$\nindex = (x \\mod m + k \\times step) \\mod m\n$$ \n```java\npublic void add(T elem) {\n    int i = (elem.hashCode() & 0x7FFFFFFF) % size;\n    int step = ...?\n    while (array[i] != null) {\n        i = (i + step) % size;\n    }\n    array[i] = elem;\n}\n``` \nHowever, we need to be very careful when choosing the step size, otherwise the problem of clusters becomes even worse. Some obviously bad examples would be a step size of 0 or the size of the table. To avoid this we can restrict the step size with the following condition: \n$$\n\\gcd(step, m) = 1 \\text{ (coprime) } \\land 0 < step < m\n$$ \nSome common choices are: \n- The size of the table $m$ is a power of 2 and the step is an odd number $\\in [1, m-1]$. \n```java\n1 + 2 * ((elem.hashCode() & 0x7FFFFFFF) % (m / 2))\n``` \n- The size of the table $m$ is a prime number and the step is $\\in [1, m-1]$. \n```java\n1 + (elem.hashCode() & 0x7FFFFFFF) % (m - 2)\n```"}}
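Putting the pieces together, a runnable sketch of double hashing, assuming a prime table size and the `1 + h % (m - 2)` step function from the second choice above. The class is illustrative and, like the text's version, `add` does not check for a full table:

```java
public class DoubleHashingTable {
    private final Object[] table;

    public DoubleHashingTable(int primeSize) {
        table = new Object[primeSize]; // size is assumed to be prime
    }

    public void add(Object elem) {
        int h = elem.hashCode() & 0x7FFFFFFF;
        int i = h % table.length;
        int step = 1 + h % (table.length - 2); // coprime to a prime m
        while (table[i] != null) {
            i = (i + step) % table.length;
        }
        table[i] = elem;
    }

    public boolean contains(Object o) {
        int h = o.hashCode() & 0x7FFFFFFF;
        int i = h % table.length;
        int step = 1 + h % (table.length - 2);
        int cnt = 0; // guard against probing forever in a full table
        while (table[i] != null && cnt < table.length) {
            if (o.equals(table[i])) return true;
            i = (i + step) % table.length;
            cnt++;
        }
        return false;
    }

    public static void main(String[] args) {
        DoubleHashingTable t = new DoubleHashingTable(13);
        t.add("alpha"); t.add("beta"); t.add("gamma");
        System.out.println(t.contains("beta"));  // true
        System.out.println(t.contains("delta")); // false
    }
}
```

Because the step is coprime to the prime table size, the probe sequence visits every slot, so the search terminates at the first `null`.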
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#9", "metadata": {"Header 1": "Hash Tables", "Header 2": "Removing Elements", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#9", "page_content": "When removing an element it can't just be set to `null` because otherwise, when looking for an element after the deletion, we could hit a null reference and stop before we find the element we are looking for (depending on language and implementation). Instead of setting it to `null` it is common practice to set it to a sentinel object. If we are then looking for an element and we hit a sentinel we can just carry on our search. This then also means that when we add an element and we come across a sentinel we can add the element in place of the sentinel. \n```java\npublic class HashTable<T> {\n    private final Object[] arr;\n    private static final Object sentinel = new Object();\n    ...\n    public void remove(Object o) {\n        assert o != null;\n        int i = (o.hashCode() & 0x7FFFFFFF) % arr.length;\n        int cnt = 0;\n        while (arr[i] != null && !o.equals(arr[i]) && cnt != arr.length) {\n            i = (i + 1) % arr.length;\n            cnt++;\n        }\n        if (o.equals(arr[i])) arr[i] = sentinel;\n    }\n\n    public boolean contains(Object o) {\n        assert o != null;\n        int i = (o.hashCode() & 0x7FFFFFFF) % arr.length;\n        int cnt = 0;\n        while (arr[i] != null && !o.equals(arr[i]) && cnt != arr.length) {\n            i = (i + 1) % arr.length;\n            cnt++;\n        }\n        return cnt != arr.length && arr[i] != null;\n    }\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#10", "metadata": {"Header 1": "Hash Tables", "Header 2": "Performance Improvements", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#10", "page_content": "Using modulo in the probe loop is not optimal because of the multiple divisions that need to be calculated. \nSo instead of `i = (i + step) % size;` we can use one of the following: \n- If the table size $m$ is a power of 2 we can use a bitmask, which is very fast. \n```java\ni = (i + step) & (size - 1);\n``` \n- Instead of using modulo, we could also manually detect an overflow. \n```java\ni = i + step; if (i >= size) i -= size;\n``` \n- Because a comparison with 0 is faster than with an arbitrary number, we could also probe backward and check for an underflow. \n```java\ni = i - step; if (i < 0) i += size;\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#11", "metadata": {"Header 1": "Hash Tables", "Header 2": "Load Factor", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#11", "page_content": "The number of collisions increases with the number of elements in the table. To be able to make statements about the state of the table there is the so-called load factor, which is defined as follows: \n$$\n\\lambda = \\frac{\\text{number of elements in table}}{\\text{table size}}\n$$ \nIf we know the number of elements to be added we can then also calculate an optimal size for the table depending on the desired load factor. \nWe can also create a new table and copy all the elements to the new table once a certain threshold load factor has been reached. However, it is important to recalculate the indices when doing this, as they depend on the table size. This process is called **rehashing**. \nWhen searching for an element in a hash table that uses the separate chaining strategy we expect to find the element after half the chain on average, so $O(1+\\frac{\\lambda}{2})$. If a search is unsuccessful then the cost is $O(1+\\lambda)$ because the entire chain was searched."}}
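The rehashing process described above might look like this for a linear-probing table. A sketch assuming a doubling growth strategy and the 0.75 threshold recommended for linear probing; the class and field names are illustrative:

```java
public class RehashingTable {
    private Object[] table = new Object[8];
    private int count = 0;

    public void add(Object elem) {
        // grow before the insert would push the load factor over 0.75
        if ((double) (count + 1) / table.length > 0.75) rehash();
        insert(table, elem);
        count++;
    }

    private void insert(Object[] t, Object elem) {
        int i = (elem.hashCode() & 0x7FFFFFFF) % t.length;
        while (t[i] != null) i = (i + 1) % t.length; // linear probing
        t[i] = elem;
    }

    // the indices depend on the table size, so every element is re-inserted
    private void rehash() {
        Object[] bigger = new Object[table.length * 2];
        for (Object o : table)
            if (o != null) insert(bigger, o);
        table = bigger;
    }

    public double loadFactor() {
        return (double) count / table.length;
    }

    public static void main(String[] args) {
        RehashingTable t = new RehashingTable();
        for (int i = 0; i < 20; i++) t.add("elem" + i);
        System.out.println(t.loadFactor() <= 0.75); // true
    }
}
```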
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#12", "metadata": {"Header 1": "Hash Tables", "Header 2": "Load Factor", "Header 3": "Separate Chaining", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#12", "page_content": "There is no upper limit for the load factor as the chains can be of any length. The average length is equivalent to the load factor. For the table to be efficient the load factor should be $\\lambda < 1$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#13", "metadata": {"Header 1": "Hash Tables", "Header 2": "Load Factor", "Header 3": "Open Addressing", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/hashTables.mdx#13", "page_content": "The load factor is limited to $\\lambda \\leq 1$. As long as $\\lambda < 1$ there is still space in the table. For optimal performance, it is recommended to have a load factor of $\\lambda < 0.75$ for linear probing and $\\lambda < 0.9$ for double hashing."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#1", "metadata": {"Header 1": "Linked Lists", "Header 2": "Linked Lists vs Arrays", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#1", "page_content": "When implementing collections with arrays we can encounter a few issues. An array has a fixed size, which leads us to implementing algorithms that only work for that fixed amount of elements. To solve this issue, when adding an element we could create an array that is one size larger, copy everything over and then add the new element. Another approach is, when the array gets full, to increase its size by either a fixed amount or an amount that depends on how many times we have already increased the size. Meaning the array is either always full or uses too much space. \nYou can imagine a linked list to be like a chain. It consists of nodes that have a value and a reference to the next node. The linked list then just needs to know the first node and can make its way through the list from there. With this method the size of the collection is dynamic and we can add as many elements as we want (limited by memory). \n"}}
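The growing-array workaround described above can be sketched as follows. The class name and the doubling growth factor are illustrative assumptions, not from the page:

```java
import java.util.Arrays;

public class GrowingArray {
    private Object[] data = new Object[4];
    private int size = 0;

    public void add(Object e) {
        if (size == data.length)
            // grow (here: double) and copy everything over: O(n)
            data = Arrays.copyOf(data, data.length * 2);
        data[size++] = e; // otherwise O(1)
    }

    public int size() { return size; }

    public int capacity() { return data.length; }

    public static void main(String[] args) {
        GrowingArray a = new GrowingArray();
        for (int i = 0; i < 9; i++) a.add(i);
        // capacity doubled twice: 4 -> 8 -> 16
        System.out.println(a.capacity()); // 16
    }
}
```

Doubling keeps the amortized cost of `add` at $O(1)$, at the price of up to half the array being unused, which is exactly the space trade-off mentioned above.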
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#2", "metadata": {"Header 1": "Linked Lists", "Header 2": "Variations", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#2", "page_content": "There are various variations of linked lists which all have their use cases."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#3", "metadata": {"Header 1": "Linked Lists", "Header 2": "Variations", "Header 3": "Singly Linked List", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#3", "page_content": "This is the common implementation when talking about linked lists. A node has a value and a reference to the next element."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#4", "metadata": {"Header 1": "Linked Lists", "Header 2": "Variations", "Header 3": "Doubly Linked List", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#4", "page_content": "Here, unlike in the singly linked list, a node has a value, a reference to the next element and additionally also a reference to the previous element. This makes the removal of a node much easier. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#5", "metadata": {"Header 1": "Linked Lists", "Header 2": "Variations", "Header 3": "Circular Linked List", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#5", "page_content": "In a circular linked list the last element does not have a reference to null as the next element but instead the head which allows the linked list to be visualized as a circle. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#6", "metadata": {"Header 1": "Linked Lists", "Header 2": "Implementing a Linked List", "Header 3": "Adding", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#6", "page_content": "When implementing the `add(E e)` function there are a few options: \n- You can iterate your way through the linked list to the end and then add the new element onto the end. This however has a complexity of $O(n)$ which is not ideal for a simple operation.\n- To solve the above issue we can keep a private reference in the list of not only the head but also the tail (last element) of the linked list.\n- There is no rule saying you have to add an element at the end. You can also just add it to the front of the list, so it becomes the new head and its reference to the next node is the old head."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#7", "metadata": {"Header 1": "Linked Lists", "Header 2": "Implementing a Linked List", "Header 3": "Removing", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#7", "page_content": "When implementing the `remove(Object o)` function there is only really one way of doing it and that is to find the node that holds the value to be removed `curr` whilst also remembering the previous node `prev` and then setting the reference of the `prev.next` to `curr.next`. This can be made easier as mentioned above by storing in each node a reference to the previous element to make it a doubly linked list. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#8", "metadata": {"Header 1": "Linked Lists", "Header 2": "Implementing a Linked List", "Header 3": "Containing", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#8", "page_content": "When implementing the `boolean contains(Object o)` function you have to iterate over the entire linked list until you either find the element or reach the end."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#9", "metadata": {"Header 1": "Linked Lists", "Header 2": "Implementing a Linked List", "Header 3": "Example", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/linkedLists.mdx#9", "page_content": "\n```java filename=\"MySingleLinkedList.java\"\n// TODO\n```\n"}}
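One possible shape for the `MySingleLinkedList.java` TODO above, combining the ideas from the Adding, Removing and Containing sections. A minimal sketch; the exact API is an assumption:

```java
public class MySingleLinkedList<E> {
    private static class Node<E> {
        E value;
        Node<E> next;
        Node(E value) { this.value = value; }
    }

    private Node<E> head, tail;

    // O(1) thanks to the tail reference
    public void add(E e) {
        Node<E> node = new Node<>(e);
        if (head == null) head = node;
        else tail.next = node;
        tail = node;
    }

    // find curr while remembering prev, then set prev.next to curr.next: O(n)
    public boolean remove(Object o) {
        Node<E> prev = null, curr = head;
        while (curr != null && !curr.value.equals(o)) {
            prev = curr;
            curr = curr.next;
        }
        if (curr == null) return false;   // reached the end, not found
        if (prev == null) head = curr.next;
        else prev.next = curr.next;
        if (curr == tail) tail = prev;
        return true;
    }

    // iterate until the element is found or the end is reached: O(n)
    public boolean contains(Object o) {
        for (Node<E> n = head; n != null; n = n.next)
            if (n.value.equals(o)) return true;
        return false;
    }

    public static void main(String[] args) {
        MySingleLinkedList<String> list = new MySingleLinkedList<>();
        list.add("a"); list.add("b"); list.add("c");
        System.out.println(list.contains("b")); // true
        list.remove("b");
        System.out.println(list.contains("b")); // false
    }
}
```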
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#1", "metadata": {"Header 1": "What is NP-Hard?", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#1", "page_content": "Lots of Euler diagrams and examples needed. Clear formulations seem to be hard to find."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#2", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "Deterministic vs Non-Deterministic Algorithms", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#2", "page_content": "example of deterministic and non-deterministic algorithms \nleaving a blank part"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#3", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "P and NP", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#3", "page_content": "What is the stuff with the verification in polynomial time? Is mentioned but unsure how exactly need and example"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#4", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "NP-Complete and NP-Hard", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#4", "page_content": "NP-Complete is NP-Hard but also has an algo in NP? \nBut solving one NP-Complete problem in polynomial time means all NP-Complete problems can be solved in polynomial time???\nSame goes for if one NP-Hard problem can be solved in polynomial time then all NP problems can be solved in polynomial time?"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#5", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "NP-Complete and NP-Hard", "Header 3": "Reduction", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#5", "page_content": "The conversion of one problem to another has to be in polynomial time???"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#6", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "NP-Complete and NP-Hard", "Header 3": "Boolean Satisfiability Problem (SAT)", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#6", "page_content": "CNF (Conjunctive Normal Form) ??? and then reduce to 0/1 Knapsack Problem"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#7", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "NP-Complete and NP-Hard", "Header 3": "Cook-Levin Theorem", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#7", "page_content": "Got prize for proving what it means if P = NP???"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#8", "metadata": {"Header 1": "What is NP-Hard?", "Header 2": "BQP", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/np.mdx#8", "page_content": "Quantum stuff"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx#1", "metadata": {"Header 1": "Queues", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx#1", "page_content": "A queue is, as the name says, like a queue of people. Meaning it follows the FIFO policy (first in, first out). The most common operations on queues are: \n- `enqueue(E e)`: Adds an element to the rear of the queue.\n- `E dequeue()`: Takes the element from the front of the queue.\n- `E peek()`: Returns the element at the front of the queue, i.e. the element that will be dequeued next. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx#2", "metadata": {"Header 1": "Queues", "Header 2": "Implementing a Queue", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx#2", "page_content": "\n```java filename=\"MyQueue.java\"\n// TODO\n```\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx#3", "metadata": {"Header 1": "Queues", "Header 2": "Implementing a Queue", "Header 3": "Queue Using two Stacks", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/queues.mdx#3", "page_content": "Although the most common way of implementing a queue is with a [linked list](./linkedLists) it is also possible to implement a queue by using two stacks. Just like when [implementing a stack with two queues](./stacks#stack-using-two-queues) you need to decide if adding or removing an element will be expensive."}}
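One way to sketch the two-stack queue with the cost put on removal (my own code; `TwoStackQueue` is a hypothetical name, not from the page):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Queue built from two stacks. Enqueue is cheap (push onto "in"); dequeue pops
// from "out", refilling it by draining "in" only when it is empty. Each element
// is moved between the stacks at most once, so dequeue is amortized O(1).
public class TwoStackQueue<E> {
    private final Deque<E> in = new ArrayDeque<>();
    private final Deque<E> out = new ArrayDeque<>();

    public void enqueue(E e) {
        in.push(e);
    }

    public E dequeue() {
        if (out.isEmpty()) {
            while (!in.isEmpty())
                out.push(in.pop()); // reverses order: oldest element ends up on top
        }
        return out.pop(); // throws NoSuchElementException if both stacks are empty
    }

    public E peek() {
        if (out.isEmpty()) {
            while (!in.isEmpty())
                out.push(in.pop());
        }
        return out.peek();
    }
}
```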
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/recursion.mdx#1", "metadata": {"Header 1": "Recursion", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/recursion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/recursion.mdx#1", "page_content": "The SRTBOT framework from MIT for designing a recursive function, with merge sort as an example:\n- Subproblem identification and definition\n- Relate the subproblem solutions to the original problem with a recurrence relation\n- Topological order of subproblems (the order in which subproblems are solved) to avoid circular dependencies, i.e.\nwe want it to be a DAG\n- Base case(s) to terminate the recursion\n- Original problem solution via the subproblem solutions\n- Time and space complexity analysis \nCan every recursive function be written as an iterative function? What about the other way around? \nWhy would you use recursion? What are the advantages and disadvantages? \nTail recursion and its impact on the stack."}}
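Applying SRTBOT to merge sort, a sketch of my own: the subproblem is "sort `a[lo..hi)`", the relation merges the two sorted halves, the topological order is by increasing subarray length, the base case is fewer than two elements, and the original problem is `sort(a, 0, a.length)` with O(n log n) time.

```java
import java.util.Arrays;

public class MergeSort {
    // Subproblem: sort a[lo..hi). Relate: merge the two sorted halves.
    // Base case: fewer than two elements is already sorted.
    public static void sort(int[] a, int lo, int hi) {
        if (hi - lo <= 1)
            return;
        int mid = lo + (hi - lo) / 2;
        sort(a, lo, mid);  // sort left half
        sort(a, mid, hi);  // sort right half
        merge(a, lo, mid, hi);
    }

    private static void merge(int[] a, int lo, int mid, int hi) {
        int[] left = Arrays.copyOfRange(a, lo, mid);
        int[] right = Arrays.copyOfRange(a, mid, hi);
        int i = 0, j = 0, k = lo;
        // repeatedly take the smaller front element of the two sorted halves
        while (i < left.length && j < right.length)
            a[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        while (i < left.length) a[k++] = left[i++];
        while (j < right.length) a[k++] = right[j++];
    }
}
```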
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx#1", "metadata": {"Header 1": "Sets", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx#1", "page_content": "A set is a data structure that holds unique elements. It represents a mathematical set, which in German is called a \"Menge\". This means that an element is either in the set or it isn't. Just like with a bag, you have the common operations of adding elements, removing elements and searching for a specific element."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx#2", "metadata": {"Header 1": "Sets", "Header 2": "Implementing a Set", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx#2", "page_content": "\n```java filename=\"UnsortedSet.java\"\n// TODO\n```\n \nJust like when [implementing the bag](./bags#array-implementations) we can use `java.util.Arrays.binarySearch(a, from, to, key)` which returns the index of the key if it is contained, and otherwise $(-(insertion point) - 1)$ with insertion point being the point where the key would be inserted, i.e. the index of the first element greater than the key. \n\n```java filename=\"SortedSet.java\"\n// TODO\n```\n"}}
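A sketch of how the still-TODO `SortedSet.java` could use that `binarySearch` return value (my own code, storing `int`s for simplicity):

```java
import java.util.Arrays;

// Sorted array-backed set: binarySearch either finds the element (skip it,
// sets hold unique elements) or returns -(insertionPoint) - 1, telling us
// where to shift right and insert.
public class SortedSet {
    private int[] elements = new int[4];
    private int size = 0;

    public boolean add(int e) {
        int index = Arrays.binarySearch(elements, 0, size, e);
        if (index >= 0)
            return false; // already contained
        int insertionPoint = -(index + 1);
        if (size == elements.length)
            elements = Arrays.copyOf(elements, size * 2); // grow backing array
        // shift the tail one slot to the right to make room: O(n)
        System.arraycopy(elements, insertionPoint, elements, insertionPoint + 1, size - insertionPoint);
        elements[insertionPoint] = e;
        size++;
        return true;
    }

    public boolean contains(int e) {
        return Arrays.binarySearch(elements, 0, size, e) >= 0; // O(log n)
    }

    public int size() { return size; }
}
```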
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx#3", "metadata": {"Header 1": "Sets", "Header 2": "Implementing a Set", "Header 3": "Time Complexities", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sets.mdx#3", "page_content": "When implementing a set and bag there is also the question of whether the data should be sorted or not. Depending on the answer the time complexities will be different and the implementation changes. \n| Operation | UnsortedSet | SortedSet |\n| ---------------- | ----------------------------------------------- | ----------------------------------------------------------------------------- |\n| add(E e) | $O(n)$
check (search) + add $O(n) + O(1)$ | $O(n)$
search insertion point (check) + shift right $O(\\log{n}) + O(n)$ |\n| search(Object o) | $O(n)$
linear search | $O(\\log{n})$
binary search |\n| remove(Object o) | $O(n)$
search + remove $O(n) + O(1)$ | $O(n)$
search insertion point (check) + shift left $O(\\log{n}) + O(n)$ |\n| Ideal use case | When a set is needed but rarely searched | When a set is needed and searched a lot |"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx#1", "metadata": {"Header 1": "Stacks", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx#1", "page_content": "A stack is, as the name suggests, like a stack of paper, meaning it follows the LIFO policy (last in, first out). The most common operations on stacks are: \n- `push(E e)`: Puts the element onto the top of the stack.\n- `E pop()`: Removes and returns the element at the top of the stack.\n- `E peek()`: Returns the element at the top of the stack, i.e. the element that will be popped next. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx#2", "metadata": {"Header 1": "Stacks", "Header 2": "Implementing a Stack", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx#2", "page_content": "\n```java filename=\"MyStack.java\"\n// TODO\n```\n"}}
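The `MyStack.java` block above is still a TODO; a minimal array-backed sketch of what it could look like (my own code, not the page's final implementation):

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Array-backed stack: push and pop work at the end of the array, so both are
// amortized O(1) (the backing array doubles when it is full).
public class MyStack<E> {
    private Object[] elements = new Object[4];
    private int size = 0;

    public void push(E e) {
        if (size == elements.length)
            elements = Arrays.copyOf(elements, size * 2);
        elements[size++] = e;
    }

    @SuppressWarnings("unchecked")
    public E pop() {
        if (size == 0)
            throw new NoSuchElementException("stack is empty");
        E value = (E) elements[--size];
        elements[size] = null; // let the GC reclaim the slot
        return value;
    }

    @SuppressWarnings("unchecked")
    public E peek() {
        if (size == 0)
            throw new NoSuchElementException("stack is empty");
        return (E) elements[size - 1];
    }

    public int size() { return size; }
}
```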
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx#3", "metadata": {"Header 1": "Stacks", "Header 2": "Implementing a Stack", "Header 3": "Stack Using two Queues", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/stacks.mdx#3", "page_content": "Although the most common way of implementing a stack is with a [linked list](./linkedLists) it is also possible to implement a stack by using two queues. Just like when [implementing a queue with two stacks](./queues#queue-using-two-stacks) you need to decide if adding or removing an element will be expensive."}}
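One way to sketch the two-queue stack with the cost put on `push` (my own code; `TwoQueueStack` is a hypothetical name, not from the page):

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Stack built from two queues, with push made expensive: after enqueueing the
// new element into the helper queue, all older elements are cycled in behind
// it. The newest element is therefore always at the front of the main queue,
// making pop and peek O(1) while push is O(n).
public class TwoQueueStack<E> {
    private Queue<E> main = new ArrayDeque<>();
    private Queue<E> helper = new ArrayDeque<>();

    public void push(E e) {
        helper.add(e);
        while (!main.isEmpty())
            helper.add(main.remove()); // older elements go behind the new one
        Queue<E> tmp = main; // swap so "main" always holds all elements
        main = helper;
        helper = tmp;
    }

    public E pop() {
        return main.remove(); // throws NoSuchElementException if empty
    }

    public E peek() {
        return main.element();
    }
}
```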
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#1", "metadata": {"Header 1": "Coins in a Line", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#1", "page_content": "This game is a tricky little coding problem that has the following rules: \n- There are an even number $n$ of coins in a line, with values $v_1, v_2, ..., v_n$, i.e. $v_i$ is the value of the i-th coin.\n- Two players, often called Alice and Bob, take turns to take a coin either from the left or the right end of the line\nuntil there are no more coins left.\n- The player whose coins have the higher total value wins. \n \nThe goal is to find an algorithm that maximizes the value of the coins that the first player (Alice) gets. \n\nThere are 4 coins with values [1, 2, 3, 4]. Alice will get the maximum value of 6 by taking the\nrightmost coin twice (4 + 2), assuming Bob also plays optimally.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#2", "metadata": {"Header 1": "Coins in a Line", "Header 2": "Greedy Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#2", "page_content": "This game isn't as simple as it seems, and it's not immediately obvious how to solve it. Most commonly people will\nstart with a greedy algorithm, which is to take the coin with the highest value at each turn. This is a good start,\nand will win in the example above, but it's not optimal. Consider the following example: \n\nThere are again 4 coins but with the values [5, 10, 25, 10]. \n1. Alice takes the right coin with value 10.\n2. Bob takes the right coin with value 25.\n3. Alice takes the right coin with value 10.\n4. Bob takes the last coin with value 5. \nAlice will have a total value of 20, and Bob will have a total value of 30. Bob wins!\n \nBy tweaking the greedy algorithm, we can get an algorithm that will always win, but not necessarily get the maximum\nvalue. Instead of taking the coin with the highest value, Alice first calculates the total value of coins in the odd\npositions, and then calculates the total value of coins in the even positions (starting at 0). She then always takes coins\nfrom the positions with the higher total sum. \n\nThere are now 6 coins with the values [1,3,6,3,1,3]. First Alice calculates the total value of coins in the\neven positions, which is 1 + 6 + 1 = 8. Then she calculates the total value of coins in the odd positions, which\nis 3 + 3 + 3 = 9. So she takes the coins in the odd positions. If Bob uses the greedy approach we get the following: \n1. Alice takes the right coin with value 3 (original position=5).\n2. Bob takes the left coin with value 1.\n3. Alice takes the left coin with value 3 (original position=1).\n4. 
Bob takes the left coin with value 6.\n5. Alice takes the left coin with value 3 (original position=3).\n6. Bob takes the last coin with value 1. \nAlice will have a total value of 9, and Bob will have a total value of 8. Alice wins, but there is a way to get 10! \nIf Bob uses the same tweaked greedy approach as Alice, we get the following: \n1. Alice takes the right coin with value 3 (original position=5)."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#3", "metadata": {"Header 1": "Coins in a Line", "Header 2": "Greedy Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#3", "page_content": "If Bob uses the same tweaked greedy approach as Alice, we get the following: \n1. Alice takes the right coin with value 3 (original position=5).\n2. Bob can't take an odd position coin, so he can take either coin as they are both in even positions and have the same value.\nLet's say he takes the left coin with value 1 because he built his algorithm to scan from left to right.\n3. Alice takes the left coin with value 3 (original position=1).\n4. Bob again can't take an odd position coin, but he takes the left coin with value 6 because it has a higher value\nthan the right coin with value 1.\n5. Alice takes the left coin with value 3 (original position=3).\n6. Bob takes the last coin with value 1. \nThe result is the same as if Bob used the normal greedy approach, because Alice always takes the coins away from Bob\nas she gets to go first.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#4", "metadata": {"Header 1": "Coins in a Line", "Header 2": "Dynamic Programming Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/coinsLine.mdx#4", "page_content": "We always assume that Bob will play optimally, optimally meaning that he will always take the coin which minimizes the\n**total amount** of coins that Alice can get."}}
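A memoized sketch of that idea (my own code, based on the assumption stated above): `best(i, j)` is the maximum total the player to move (Alice) can still collect from coins `i..j`, and Bob is assumed to then take whichever end minimizes Alice's remaining total.

```java
// best(i, j): maximum total Alice can collect from coins[i..j] on her turn,
// assuming Bob then removes the end that minimizes Alice's remaining total.
// Memoized on (i, j), so there are O(n^2) subproblems.
public class CoinsInALine {
    public static int maxValue(int[] coins) {
        Integer[][] memo = new Integer[coins.length][coins.length];
        return best(coins, 0, coins.length - 1, memo);
    }

    private static int best(int[] v, int i, int j, Integer[][] memo) {
        if (i > j) return 0;      // no coins left
        if (i == j) return v[i];  // single coin left
        if (memo[i][j] != null) return memo[i][j];
        // Alice takes the left coin; Bob then removes the worse end for Alice.
        int takeLeft = v[i] + Math.min(best(v, i + 2, j, memo), best(v, i + 1, j - 1, memo));
        // Alice takes the right coin; same reasoning mirrored.
        int takeRight = v[j] + Math.min(best(v, i + 1, j - 1, memo), best(v, i, j - 2, memo));
        memo[i][j] = Math.max(takeLeft, takeRight);
        return memo[i][j];
    }
}
```

For the [5, 10, 25, 10] example above this returns 30: Alice's optimal first move is actually the left coin with value 5, not one of the coins with value 10.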
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx#1", "metadata": {"Header 1": "Introduction to DP", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx#1", "page_content": "Dynamic programming, or DP for short, is a problem-solving technique or more formally an algorithmic design paradigm just like \"divide\nand conquer\" or a \"greedy algorithm\". It is used to solve problems that can be broken down into sub-problems (just like\ndivide and conquer) which are then solved recursively. For a problem to be solved using dynamic programming, it must\nhave two properties: \n- **Overlapping Sub-problems**: When the problem is broken down into sub-problems, the same sub-problems are solved\nmultiple times, i.e. there is an overlap.\n- **Optimal Substructure**: When the most optimal solution for the original problem can be constructed using the\noptimal solutions of the sub-problems. \nWe can illustrate these two properties using the Fibonacci sequence. The Fibonacci sequence is defined as follows: \n```java\npublic int fib(int n) {\nif (n <= 1)\nreturn n;\nreturn fib(n - 1) + fib(n - 2);\n}\n``` \nWhen we illustrate the recursive calls of the `fib` function as a tree (always a good idea when working with dynamic\nprogramming problems), we can see that the same sub-problems are solved multiple times. For example, for `fib(6)` we can\nsee that `fib(3)` is solved three times, so there is an overlap.\nThe other property, optimal substructure, is also satisfied. The optimal solution for `fib(6)` is constructed using the\noptimal solutions of `fib(5)` and `fib(4)`. \n \nFrom the tree above we can also see that the time complexity of the `fib` function is exponential, i.e. `O(2^n)`. This\nis because the same sub-problems are solved multiple times. 
As we will see later, dynamic programming can be used to\nimprove the time complexity of the `fib` function to `O(n)`. This is a huge improvement and is most often the reason why\ndynamic programming is used because it can drastically improve the time complexity of a function."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx#2", "metadata": {"Header 1": "Introduction to DP", "Header 2": "Top-Down Approach (Memoization)", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx#2", "page_content": "The top-down approach is the most common way to solve dynamic programming problems. It is also called **memoization**.\nThe idea is to store the results of the sub-problems so that we do not have to re-compute them when they are needed\nagain later due to the overlapping sub-problems property. This technique is called memoization because we store the\nresults of the sub-problems in a lookup table (memo). \nIt is called top-down because we still start with the original problem and break it down into sub-problems and solve\nthem recursively. \nWhen implementing memoization it is important to think about the data structure that will be used to store the results\nas we want quick lookups. This leads to most implementations using either just a simple array where the index is the\ninput to the function or a hash map where the key is the input to the function. \n```java\npublic int fib(int n) {\nif (n < 0)\nthrow new IllegalArgumentException(\"n must be greater than or equal to 0\");\nif (n <= 1)\nreturn n;\n\nInteger[] memo = new Integer[n + 1]; // Integer instead of int uses more memory but lets null mean \"not yet computed\"\n\n// base cases\nmemo[0] = 0;\nmemo[1] = 1;\nreturn fibMemo(n, memo);\n}\n\npublic int fibMemo(int n, Integer[] memo) {\nif (memo[n] != null)\nreturn memo[n];\nmemo[n] = fibMemo(n - 1, memo) + fibMemo(n - 2, memo);\nreturn memo[n];\n}\n``` \nAfter implementing the memoization technique, we can see in the tree below that the time complexity of the `fib`\nfunction is now `O(n)` as each sub-problem is only solved once. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx#3", "metadata": {"Header 1": "Introduction to DP", "Header 2": "Bottom-Up Approach (Tabulation)", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/introduction.mdx#3", "page_content": "The bottom-up approach is the other way to solve dynamic programming problems. It is also called **tabulation**. The\nidea is to solve the sub-problems first, i.e. some of the base cases, and then use the results of those sub-problems to\nsolve the original problem, hence the name bottom-up. This technique is called tabulation because we store the results\nof the sub-problems in a table (depending on the problem, this can be a 1D or 2D array). \nWhen implementing memoization it helped to visualize the recursive calls as a tree. When implementing tabulation it\nis also additionally helpful to visualize the results as a table or list (depending on the problem) to find a pattern. \nFor a visualisation of the tabulation technique I can recommend watching [this video](https://youtu.be/oBt53YbR9Kk?t=11513)\nat the 3:11:50 mark. The whole video is great and I can recommend watching it all and also the 4 part [video series by\nMIT on dynamic programming from 2020](https://www.youtube.com/watch?v=r4-cftqTcdI&t=7s). \nFor the Fibonacci sequence, we can see that the base cases are `fib(0)` and `fib(1)`. We can then use those results to\nthen iteratively solve the rest of the sub-problems until we reach the original problem. 
\n```java\npublic int fib(int n) {\nif (n < 0)\nthrow new IllegalArgumentException(\"n must be greater than or equal to 0\");\nif (n <= 1)\nreturn n;\n\nint[] memo = new int[n + 1];\n\n// base cases\nmemo[0] = 0;\nmemo[1] = 1;\n\nfor (int i = 2; i <= n; i++) {\nmemo[i] = memo[i - 1] + memo[i - 2];\n}\nreturn memo[n];\n}\n``` \nThe above code then again results in a time complexity of `O(n)`, much better than the original `O(2^n)`."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/knapsack.mdx#1", "metadata": {"Header 1": "Knapsack Problem", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/knapsack.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/knapsack.mdx#1", "page_content": "The knapsack problem is a very popular problem with many different variations. The problem is as follows: \n> Given a set of items, each with a weight and a value, determine which items you should pick to maximize the value\n> while keeping the overall weight smaller than the limit of your knapsack (backpack). \n \nSome popular variations of the knapsack problem are: \n- 0/1 Knapsack: You can either take an item or not take it.\n- Unbounded Knapsack: You can take an item multiple times.\n- Bounded Knapsack: You can take an item a limited number of times.\n- Fractional Knapsack: You can take a fraction of an item. \nThe [subset sum problem](./subsetSum) is a variation of the knapsack problem where the weight of each item is equal to its value and\nthe goal is not to maximize the value but to get a specific value and weight. In my definition of the subset sum problem I allowed\nan item to be used multiple times, so it is a variation of the unbounded knapsack problem. \n\nActually implement the knapsack problem with the different variations.\n"}}
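To make the 0/1 and unbounded variants concrete, a tabulated sketch (my own code, the page's implementation is still a TODO): `dp[c]` is the best value achievable with capacity `c` using the items considered so far, and the only difference between the two variants is the direction of the inner capacity loop.

```java
public class Knapsack {
    // 0/1 knapsack: iterating capacities downwards ensures each item is used
    // at most once, because dp[c - weights[i]] still reflects the state
    // before item i was considered.
    public static int knapsack01(int[] weights, int[] values, int capacity) {
        int[] dp = new int[capacity + 1];
        for (int i = 0; i < weights.length; i++) {
            for (int c = capacity; c >= weights[i]; c--) {
                dp[c] = Math.max(dp[c], dp[c - weights[i]] + values[i]);
            }
        }
        return dp[capacity];
    }

    // Unbounded knapsack: iterating capacities upwards lets an item be
    // stacked on top of solutions that already contain it.
    public static int knapsackUnbounded(int[] weights, int[] values, int capacity) {
        int[] dp = new int[capacity + 1];
        for (int i = 0; i < weights.length; i++) {
            for (int c = weights[i]; c <= capacity; c++) {
                dp[c] = Math.max(dp[c], dp[c - weights[i]] + values[i]);
            }
        }
        return dp[capacity];
    }
}
```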
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#1", "metadata": {"Header 1": "Subset Sum Problem", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#1", "page_content": "For the subset sum problem, we are given an array of integers and a target sum, to keep it simple we will assume that\nthe array only contains positive integers and that the target sum is also positive. We will also allow an element in the\narray to be used multiple times. \nFrom this input we can then ask the following questions: \n- Is there a subset of the array that sums to the target sum? I will call this the `canSum` problem.\n- How many subsets of the array sum to the target sum? I will call this the `countSum` problem.\n- If there is a subset that sums to the target sum, what is the subset? I will call this the `howSum` problem.\n- If there is a subset that sums to the target sum, what is the minimum number of elements in the subset? I will call\nthis the `bestSum` problem. 
\n \nIf we are given the array `[2, 3, 5]` and the target sum `8`, then the answers to the above questions are: \n- `canSum(8, [2, 3, 5]) = true`\n- `countSum(8, [2, 3, 5]) = 2` (the subsets are `[2, 2, 2, 2]` and `[3, 5]`)\n- `howSum(8, [2, 3, 5]) = [2, 2, 2, 2]`\n- `bestSum(8, [2, 3, 5]) = [3, 5]` \nAnd for the array `[2, 4]` and the target sum `7` we get: \n- `canSum(7, [2, 4]) = false`\n- `countSum(7, [2, 4]) = 0`\n- `howSum(7, [2, 4]) = null`\n- `bestSum(7, [2, 4]) = null` \nAnd for an example that is not so trivial, we can use the array `[1, 2, 5, 25]` and the target sum `100`: \n- `canSum(100, [1, 2, 5, 25]) = true`\n- `countSum(100, [1, 2, 5, 25]) = 154050750` seems about right\n- `howSum(100, [1, 2, 5, 25]) = [1,1,1,1,1...1]` (100 times) because of the order of the for loop\n- `bestSum(100, [1, 2, 5, 25]) = [25, 25, 25, 25]` \n \nThe subset sum problem is a very popular problem but also a very hard problem computationally. As will become clearer\nbelow the time complexity of the subset sum problem is `O(n^m)` where `n` is the length of the array and `m` is the\ntarget sum. This is because we have to try all possible combinations of the elements in the array to find a subset that"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#2", "metadata": {"Header 1": "Subset Sum Problem", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#2", "page_content": "below the time complexity of the subset sum problem is `O(n^m)` where `n` is the length of the array and `m` is the\ntarget sum. This is because we have to try all possible combinations of the elements in the array to find a subset that\nsums to the target sum. This is also why dynamic programming is so useful for this problem because it can drastically\nimprove the time complexity. \n\nWhat does it mean for a problem to be NP-complete? Is the subset sum problem NP-complete etc.?\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#3", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "Can Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#3", "page_content": "Our first approach to this problem is most likely a brute force approach. We can use recursion to solve this problem\nby trying to subtract each element in the array from the target sum and then recursively calling the function again with\nthe new target sum. If the target sum is 0 then we have found a subset that sums to the target, and we can return\ntrue. If the target sum is negative then we have not found a subset that sums to the target sum and we can return false.\nThese results are then propagated back up the call stack until we reach the original call (the parent node in the tree\nbecomes true if any of its children are true and otherwise false). \nWe can construct the following tree to visualize the recursive calls: \n \n```java\npublic boolean canSum(int targetSum, int[] numbers) {\nif (targetSum == 0)\nreturn true;\nif (targetSum < 0)\nreturn false;\n\nfor (int num : numbers) {\nint remainder = targetSum - num;\nif (canSum(remainder, numbers))\nreturn true;\n}\nreturn false;\n}\n``` \nFrom the tree above we can see that the time complexity of the `canSum` function is `O(n^m)` where `n` is the length of\nthe array (the number of children per node) and `m` is the target sum (the depth of the tree, which would be maximal if\nthe array contained a 1). We can improve the time complexity of the `canSum` function to `O(n*m)` by using memoization. 
\n```java\npublic boolean canSum(int targetSum, int[] numbers) {\nif (targetSum < 0)\nthrow new IllegalArgumentException(\"targetSum must be greater than or equal to 0\");\n\nBoolean[] memo = new Boolean[targetSum + 1]; // null means not yet computed, so negative results are cached too\nmemo[0] = true;\n\nreturn canSumMemo(targetSum, numbers, memo);\n}\n\npublic boolean canSumMemo(int targetSum, int[] numbers, Boolean[] memo) {\nif (targetSum < 0)\nreturn false;\nif (memo[targetSum] != null)\nreturn memo[targetSum];"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#4", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "Can Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#4", "page_content": "return canSumMemo(targetSum, numbers, memo);\n}\n\npublic boolean canSumMemo(int targetSum, int[] numbers, Boolean[] memo) {\nif (targetSum < 0)\nreturn false;\nif (memo[targetSum] != null)\nreturn memo[targetSum];\n\nfor (int num : numbers) {\nint remainder = targetSum - num;\nif (canSumMemo(remainder, numbers, memo)) {\nmemo[targetSum] = true;\nreturn true;\n}\n}\nmemo[targetSum] = false;\nreturn false;\n}\n``` \nTo use tabulation instead of memoization we would need to construct a table (array) of size `targetSum + 1` and then\nfill it with the base cases and find some sort of pattern. So we would initially fill the table with `false` and then\nset index 0 to `true` because the target sum 0 can always be constructed using an empty array. Then we need to\ndo some thinking to find the pattern. \nIf we think of our current position in the table as the target sum, i.e. in the first iteration we are at index 0, then\nwe know that we can construct the target sums that we get by adding each number in the array to the current position. For example\nif we are at index 0 and the array is `[5,4,3]` and we have the target 7 then we know that we can construct the target\nsums 5, 4 and 3 by adding each number to the current position. So we can set the values at index 5, 4 and 3 to\n`true`. We can then move on and set our current index to 1 and we know that we can't construct the target sum 1 using\nthe array so we can skip it, same goes for index 2. But we can construct the target sum 3, so it gets interesting again.\nWe can then again add each number in the array to the current position and set the values at index 7 and 6 to `true`\n(index 8 would be outside the table since the target is 7).\nThis process continues until we reach the end of the table. If we then return the value at the last index we will have\nour result. \nThis [blog post](https://teepika-r-m.medium.com/dynamic-programming-basics-part-2-758b00e0a4b0) visualizes the process very well. \n```java\npublic boolean canSum(int targetSum, int[] numbers) {\nif (targetSum < 0)\nthrow new IllegalArgumentException(\"targetSum must be greater than or equal to 0\");"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#5", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "Can Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#5", "page_content": "boolean[] table = new boolean[targetSum + 1];\nArrays.fill(table, false); // not needed but makes it more clear\ntable[0] = true;\n\nfor (int i = 0; i <= targetSum; i++) {\nif (table[i]) {\nfor (int num : numbers) {\nif (i + num < table.length)\ntable[i + num] = true;\n}\n}\n}\nreturn table[targetSum];\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#6", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "Count Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#6", "page_content": "The `countSum` problem is very similar to the `canSum` problem. The only difference is that when the target sum is 0 we\nreturn 1 instead of true and when the target sum is negative we return 0 instead of false and then in the parent node\nwe sum up the results of the children. \n \nThe brute force approach would look like this with a time complexity of `O(n^m)`: \n```java\npublic int countSum(int targetSum, int[] numbers) {\nif (targetSum == 0)\nreturn 1;\nif (targetSum < 0)\nreturn 0;\n\nint count = 0;\nfor (int num : numbers) {\nint remainder = targetSum - num;\ncount += countSum(remainder, numbers);\n}\nreturn count;\n}\n``` \nAnd the memoized version would look like this with a time complexity of `O(n*m)`: \n```java\npublic int countSum(int targetSum, int[] numbers) {\nif (targetSum < 0)\nthrow new IllegalArgumentException(\"targetSum must be greater than or equal to 0\");\n\nint[] memo = new int[targetSum + 1];\nArrays.fill(memo, -1);\nmemo[0] = 1;\n\nreturn countSumMemo(targetSum, numbers, memo);\n}\n\npublic int countSumMemo(int targetSum, int[] numbers, int[] memo) {\nif (targetSum < 0)\nreturn 0;\nif (memo[targetSum] != -1)\nreturn memo[targetSum];\n\nint count = 0;\nfor (int num : numbers) {\nint remainder = targetSum - num;\ncount += countSumMemo(remainder, numbers, memo);\n}\nmemo[targetSum] = count;\nreturn count;\n}\n``` \nOne issue is that it will count the same subset multiple times but with different ordering of the elements, as we can\nsee in the tree above."}}
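The tabulation version is deferred on this page, but a sketch mirroring the `canSum` table could look like this (my own code): `table[i]` counts the ordered sequences of numbers summing to `i`, so like the memoized version it counts the same subset once per ordering.

```java
public class CountSum {
    // Tabulated countSum: table[i] holds how many ordered sequences of
    // numbers sum to i. Each reachable sum i pushes its count forward to
    // i + num, mirroring the canSum table. Time complexity O(n*m).
    public static int countSum(int targetSum, int[] numbers) {
        int[] table = new int[targetSum + 1];
        table[0] = 1; // one way to build 0: the empty sequence
        for (int i = 0; i <= targetSum; i++) {
            if (table[i] == 0)
                continue; // sum i is unreachable, nothing to propagate
            for (int num : numbers) {
                if (i + num <= targetSum)
                    table[i + num] += table[i];
            }
        }
        return table[targetSum];
    }
}
```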
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#7", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "How Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#7", "page_content": "The `howSum` problem is again a variation of the `canSum` problem. The only difference is that when the target sum is 0\nwe return an empty array instead of true and when the target sum is negative we return null instead of false and then\nin the parent node we return the array with the element that was used to get to the target sum. To solve this problem\nit doesn't matter if the array is the shortest or longest possible array that sums to the target sum it will just be\none of the possible solutions (The furthest left solution in the tree above because of the order of the for loop and\nthe recursive call). \n```java\npublic int[] howSum(int targetSum, int[] numbers) {\nif (targetSum == 0)\nreturn new int[0];\nif (targetSum < 0)\nreturn null;\n\nfor (int num : numbers) {\nint remainder = targetSum - num;\nint[] result = howSum(remainder, numbers);\nif (result != null) {\nint[] newArray = new int[result.length + 1];\nSystem.arraycopy(result, 0, newArray, 0, result.length); // O(n)\nnewArray[result.length] = num;\nreturn newArray;\n}\n}\nreturn null;\n}\n``` \nWith memoization: \n```java\npublic int[] howSum(int targetSum, int[] numbers) {\nif (targetSum < 0)\nthrow new IllegalArgumentException(\"targetSum must be greater than or equal to 0\");\n\nint[][] memo = new int[targetSum + 1][]; // will be jagged array\nArrays.fill(memo, null); // not needed but makes it more clear\nmemo[0] = new int[0];\n\nreturn howSumMemo(targetSum, numbers, memo);\n}\n\npublic int[] howSumMemo(int targetSum, int[] numbers, int[][] memo) {\nif (targetSum < 0)\nreturn null;\nif (memo[targetSum] != null)\nreturn 
memo[targetSum];\n\nfor (int num : numbers) {\nint remainder = targetSum - num;\nint[] result = howSumMemo(remainder, numbers, memo);\nif (result != null) {\nint[] newArray = new int[result.length + 1];\nSystem.arraycopy(result, 0, newArray, 0, result.length); // O(n)\nnewArray[result.length] = num;\nmemo[targetSum] = newArray;\nreturn newArray;\n}\n}\nmemo[targetSum] = null;\nreturn null;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#8", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "Best Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#8", "page_content": "The `bestSum` problem is again a variation of the `howSum` problem. It is very similar to the `howSum` problem but\ninstead of returning the first array that sums to the target sum, we return the shortest array that sums to the target\nsum. \n```java\npublic int[] bestSum(int targetSum, int[] numbers) {\nif (targetSum == 0)\nreturn new int[0];\nif (targetSum < 0)\nreturn null;\n\nint[] shortestArray = null;\nfor (int num : numbers) {\nint remainder = targetSum - num;\nint[] result = bestSum(remainder, numbers);\nif (result != null) {\nint[] newArray = new int[result.length + 1];\nSystem.arraycopy(result, 0, newArray, 0, result.length); // O(n)\nnewArray[result.length] = num;\nif (shortestArray == null || newArray.length < shortestArray.length)\nshortestArray = newArray;\n}\n}\nreturn shortestArray;\n}\n``` \nWith memoization: \n```java\npublic int[] bestSum(int targetSum, int[] numbers) {\nif (targetSum < 0)\nthrow new IllegalArgumentException(\"targetSum must be greater than or equal to 0\");\n\nint[][] memo = new int[targetSum + 1][]; // will be jagged array\nArrays.fill(memo, null); // not needed but makes it more clear\nmemo[0] = new int[0];\n\nreturn bestSumMemo(targetSum, numbers, memo);\n}\n\npublic int[] bestSumMemo(int targetSum, int[] numbers, int[][] memo) {\nif (targetSum < 0)\nreturn null;\nif (memo[targetSum] != null)\nreturn memo[targetSum];\n\nint[] shortestArray = null;\nfor (int num : numbers) {\nint remainder = targetSum - num;\nint[] result = bestSumMemo(remainder, numbers, memo);\nif (result != null) {\nint[] newArray = new int[result.length + 1];\nSystem.arraycopy(result, 0, 
newArray, 0, result.length); // O(n)\nnewArray[result.length] = num;\nif (shortestArray == null || newArray.length < shortestArray.length)\nshortestArray = newArray;\n}\n}\nmemo[targetSum] = shortestArray;\nreturn shortestArray;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#9", "metadata": {"Header 1": "Subset Sum Problem", "Header 2": "All Sum", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/dynamicProgramming/subsetSum.mdx#9", "page_content": "The `allSum` problem is again a variation of the `canSum` problem, and is almost a combination of the `countSum` and\n`howSum` problems. However, it is a bit more complicated because we need to return a list of arrays instead of just one\nresult. \n\nCan't be bothered to implement this right now. Maybe later. Same goes for the tabulation versions of the above\nproblems.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#1", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#1", "page_content": "Vertex centrality measures can be used to determine the importance of a vertex in a graph. There are many different\nvertex centrality measures, each with their own advantages and disadvantages. In a communication network a vertex with\nhigh centrality is an actor that is important for the communication in the network, hence they are also often called\nactor centrality measures. An actor with high centrality can control the flow of information in the network for good or\nbad. They can also be used to determine key actors in a network, for example in a power grid it is important to know\nwhich vertices are key actors, because if they fail, the whole network fails."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#2", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Degree Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#2", "page_content": "The degree centrality is the simplest centrality measure. It is simply the number of edges connected to a vertex. The\ndegree centrality is a local measure, because it only takes into account the direct neighbors of a vertex. It can be\ncalculated using the $\\text{deg()}$ function. Or alternatively using the $\\text{indeg()}$ and $\\text{outdeg()}$\ndepending on whether the graph is directed or not and the use-case. \nexport const vertexDegreeGraph = {\nnodes: [\n{id: 1, label: \"2\", x: 0, y: 0},\n{id: 2, label: \"2\", x: 0, y: 200},\n{id: 3, label: \"3\", x: 200, y: 100, color: \"red\"},\n{id: 4, label: \"2\", x: 400, y: 100},\n{id: 5, label: \"3\", x: 600, y: 100, color: \"red\"},\n{id: 6, label: \"2\", x: 800, y: 0},\n{id: 7, label: \"2\", x: 800, y: 200}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 5, to: 6},\n{from: 5, to: 7},\n{from: 6, to: 7}\n]\n}; \n \nThe degree centrality can be normalized by dividing it by the maximum possible degree in the graph. This is rarely done\nin practice, because a lot of values will be small, and we are most often interested in the actual degree of a vertex. \nThe interpretation of the degree centrality is pretty self-explanatory. And is closely related to the\n[prestige](#prestige) of a vertex."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#3", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Closeness Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#3", "page_content": "Unlike the degree centrality, the closeness centrality is a global measure, because it takes into account the whole\ngraph, however the consequence of this is that it is more expensive to calculate. \n> The idea of the closeness centrality is that a vertex is important if it is close to the center of the graph. So a vertex\nis important if it is **close** to all other vertices in the graph, i.e. it is close to the center of the graph. \nThis also means that a vertex can be important even if it only has one edge. As seen by the green vertex in the following graph. \nexport const vertexDegreeProblemGraph = {\nnodes: [\n{id: 1, label: \"2\", x: 0, y: 0},\n{id: 2, label: \"2\", x: 0, y: 200},\n{id: 3, label: \"3\", x: 200, y: 100, color: \"red\"},\n{id: 4, label: \"2\", x: 400, y: 100},\n{id: 5, label: \"3\", x: 600, y: 100, color: \"red\"},\n{id: 6, label: \"2\", x: 800, y: 0},\n{id: 7, label: \"2\", x: 800, y: 200},\n{id: 8, label: \"1\", x: 400, y: 0, color: \"green\"}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 5, to: 6},\n{from: 5, to: 7},\n{from: 6, to: 7},\n{from: 4, to: 8},\n]\n}; \n \nThe closeness centrality for a vertex $v$ is calculated by taking the inverse distance of all shortest paths from the\nvertex $v$ to all other vertices in the graph. This can be interpreted as how efficiently can all the other vertices\nbe reached from $v$. 
The formula for the closeness centrality is as follows: \n$$\n\\text{closenessCentrality}(v) = \\sum_{u \\in V \\setminus \\{v\\}}{d(v,u)^{-1}} = \\sum_{u \\in V \\setminus \\{v\\}}{\\frac{1}{d(v,u)}}\n$$ \nWhere $d(v,u)$ is the length of the shortest path from $v$ to $u$. Let us calculate the closeness centrality for the\ngreen vertex in the graph above; the second step normalizes the result by dividing it by $|V| - 1 = 7$. \n$$\n\\begin{align*}\n1 + \\frac{1}{2} + \\frac{1}{2} + \\frac{1}{3} + \\frac{1}{3} + \\frac{1}{3} + \\frac{1}{3} &= \\frac{10}{3} \\\\\n\\frac{10}{3} \\cdot \\frac{1}{8-1} &= \\frac{10}{21} \\approx 0.476\n\\end{align*}\n$$"}}
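This calculation can be sketched in a few lines of Python with a single BFS from the vertex; the edge list mirrors the figure above, with the green vertex as id 8 (a sketch, not an optimized implementation):

```python
# Closeness centrality as the sum of inverse shortest-path distances,
# computed with a BFS from the vertex (ids match the figure; vertex 8
# is the green one attached to vertex 4).
from collections import deque

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7), (4, 8)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness_centrality(edges, v, normalized=False):
    adj = adjacency(edges)
    dist = bfs_distances(adj, v)
    total = sum(1 / d for u, d in dist.items() if u != v)
    return total / (len(adj) - 1) if normalized else total

print(round(closeness_centrality(edges, 8, normalized=True), 3))  # 0.476
```

The unnormalized value for the green vertex is $\frac{10}{3}$, matching the worked example.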
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#4", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Closeness Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#4", "page_content": "green vertex in the graph above. \n$$\n\\begin{align*}\n1 + \\frac{1}{2} + \\frac{1}{2} + \\frac{1}{3} + \\frac{1}{3} + \\frac{1}{3} + \\frac{1}{3} &= \\frac{10}{3} \\\\\n\\frac{10}{3} \\cdot \\frac{1}{8-1} &= \\frac{10}{21} \\approx 0.476\n\\end{align*}\n$$ \nTo normalize the closeness centrality, it can be divided by $|V| - 1$. \nexport const vertexClosenessGraph = {\nnodes: [\n{id: 1, label: \"0.524\", x: 0, y: 0},\n{id: 2, label: \"0.524\", x: 0, y: 200},\n{id: 3, label: \"0.596\", x: 200, y: 100},\n{id: 4, label: \"0.714\", x: 400, y: 100, color: \"red\"},\n{id: 5, label: \"0.596\", x: 600, y: 100},\n{id: 6, label: \"0.524\", x: 800, y: 0},\n{id: 7, label: \"0.524\", x: 800, y: 200},\n{id: 8, label: \"0.476\", x: 400, y: 0, color: \"green\"}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 5, to: 6},\n{from: 5, to: 7},\n{from: 6, to: 7},\n{from: 4, to: 8},\n]\n}; \n \n\nThis gives different values to the formula from wikipedia and networkx. They use the following formula: \n$$\n\\text{closenessCentrality}(v) = \\frac{1}{\\sum_{u \\in V \\setminus \\{v\\}}{d(v,u)}}\n$$ \nand for the normalized closeness centrality: \n$$\n\\text{closenessCentrality}(v) = \\frac{|V| - 1}{\\sum_{u \\in V \\setminus \\{v\\}}{d(v,u)}}\n$$ \nwhere $d(v,u)$ is the length of the shortest path from $v$ to $u$. \nThe issue with the above formula is that if no path exists between $v$ and $u$ then the distance is $\\infty$ which\nwould lead to the closeness centrality being $0$. 
This could be solved by just using $0$ instead of $\\infty$. The first formula avoids the problem\nautomatically, because $\\frac{1}{\\infty}$ is $0$, i.e. an unreachable vertex simply adds $0$ to the sum.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#5", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Betweenness Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#5", "page_content": "In the above example using the degree centrality we saw that the green ones are the most important. However,\nwe can clearly visually see that the vertex inbetween them is the most important one as it connects the two communities.\nBecause of this we could say that that vertex is in Brokerage position or is a Broker/Gatekeeper of information. \n \nThe betweenness centrality is a global measure that takes into account the whole graph and tries to solve the above\nissue. \n> The idea of the betweenness centrality is that a vertex is important if a lot of shortest paths go through it, i.e. it is\n> **between** a lot of vertices. \nTo calculate the betweenness centrality we need to calculate the number of shortest paths that go through a vertex $v$.\nSo for every pair of vertices $u$ and $w$ we need to calculate the shortest paths and then count how many of them go\nthrough $v$. The formula for the betweenness centrality is as follows: \n$$\n\\text{betweennessCentrality}(v) = \\sum_{u \\neq v \\neq w}{\\frac{\\sigma_{uw}(v)}{\\sigma_{uw}}}\n$$ \nWhere $\\sigma_{uw}$ is the number of shortest paths from $u$ to $w$ and $\\sigma_{uw}(v)$ is the number of shortest paths\nfrom $u$ to $w$ that go through $v$. \n\nThe fraction in the formula leads to the weight being split if there are multiple shortest paths between $u$ and $w$.\n \nBecause the calculations for the betweenness centrality are quite complex and take a while to calculate, we will use a\nsmaller graph to calculate the betweenness centrality. 
\nWe start with all betweenness centralities being $0$. We start with the first vertex on the left and mark it green.\n\n\nWe then calculate the shortest path to the next one in a BFS manner. The vertex to the right is the next one so we"}}
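For small graphs the formula can also be evaluated by brute force: count shortest paths with a BFS per vertex and, for each pair, check which vertices lie on a shortest path. The sketch below uses this section's 5-vertex example graph and counts each unordered pair once:

```python
# Betweenness centrality by brute force (a sketch for small graphs):
# for every pair u,w, add the fraction of shortest u-w paths through v.
from collections import deque

edges = [(1, 2), (2, 3), (2, 4), (3, 4), (3, 5), (4, 5)]

def adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_counts(adj, source):
    """Distance and number of shortest paths from source to each vertex."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(edges):
    adj = adjacency(edges)
    nodes = sorted(adj)
    dist, sigma = {}, {}
    for s in nodes:
        dist[s], sigma[s] = bfs_counts(adj, s)
    bc = {v: 0.0 for v in nodes}
    for i, u in enumerate(nodes):
        for w in nodes[i + 1:]:  # each unordered pair once
            for v in nodes:
                if v in (u, w):
                    continue
                # v lies on a shortest u-w path iff the distances add up
                if dist[u][v] + dist[v][w] == dist[u][w]:
                    bc[v] += sigma[u][v] * sigma[v][w] / sigma[u][w]
    return bc

print(betweenness(edges))  # {1: 0.0, 2: 3.0, 3: 1.0, 4: 1.0, 5: 0.0}
```

The results match the labels of the 5-vertex example graph: $0$, $3$, $1$, $1$, $0$.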
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#6", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Betweenness Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#6", "page_content": "We start with all betweenness centralities being $0$. We start with the first vertex on the left and mark it green.\n\n\nWe then calculate the shortest path to the next one in a BFS manner. The vertex to the right is the next one so we\nmark it green as the target vertex. Because it is directly connected to the other green one nothing changes. Now\nthat we have visited it we mark it gray.\n\n\nWe take the next vertex, the one above and mark it green. We then calculate the shortest path between the two green\nvertices. There is only one shortest path going over the previously visited gray vertex. So we add $1$ to that gray\nvertexes betweenness centrality.\n\n\nWe continue this process until we have visited all vertices once. We then mark the initial vertex on the left as red.\nAll shortest paths that start at this vertex have been calculated. 
We then pick a new start vertex in a BFS manner.\nRepeat the process until all shortest paths have been calculated.\n\n \nexport const vertexBetweennessGraph = {\nnodes: [\n{id: 1, label: \"0\", x: 0, y: 200},\n{id: 2, label: \"3\", x: 200, y: 200},\n{id: 3, label: \"1\", x: 400, y: 0},\n{id: 4, label: \"1\", x: 400, y: 400},\n{id: 5, label: \"0\", x: 600, y: 200},\n],\nedges: [\n{from: 1, to: 2},\n{from: 2, to: 3},\n{from: 2, to: 4},\n{from: 3, to: 4},\n{from: 3, to: 5},\n{from: 4, to: 5},\n]\n}; \n \nTo normalize the betweenness centrality, you divide the centrality by the following: \n- For an undirected graph: $\\frac{(n-1)(n-2)}{2}$\n- For a directed graph: $(n-1)(n-2)$ \nThe image below summarizes all the centrality measures we have seen so far and compares the most central vertices. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#7", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Eigenvector Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#7", "page_content": "Before I start explaining the eigenvector centrality, I describe what an eigenvector is. An eigenvector is a\nvector that does not change its direction when multiplied by a square matrix, only its magnitude changes, i.e. it is only\nscaled. Because a matrix can have multiple eigenvectors, the solution is to allow for only eigenvectors with a magnitude of\n1, i.e. $||\\boldsymbol{v}||_2 = 1$, i.e. the normalized eigenvector. The scaling factor is then called the eigenvalue,\ndenoted by $\\lambda$. The formula for the eigenvector is as follows: \n$$\n\\boldsymbol{Av}=\\lambda \\boldsymbol{v}\n$$ \nThe eigenvector centrality is the eigenvector corresponding to the largest eigenvalue of the adjacency matrix of the\ngraph. The eigenvector corresponding to the largest eigenvalue is also commonly called the dominant eigenvalue/vector.\nThis can just be calculated but is most often calculated using the power iteration method. \nThe eigenvector centrality is an interesting centrality measure. \n> The idea is that a node is important if its neighbors are important. \nWhat makes a vertex important could be any attribute of the vertex, for example if we have\na network of people, their salary. However, the simplest and most commonly used approach is to use the degree\ncentrality as the importance measure. In an undirected graph most commonly the in-degree centrality. 
\nTo show the idea that the eigenvector centrality is based on the importance of the neighbors, I will use the following\ngraph and calculate the eigenvector centrality using the degree centrality as the importance measure with the power\niteration method. \n \n \n#### Power Iteration Method"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#8", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Eigenvector Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#8", "page_content": "width={400}\n/> \n \n#### Power Iteration Method \nThe power iteration method is a simple iterative method to calculate the eigenvector corresponding to the largest eigenvalue. \nThe idea is to start with an initial vector $\\boldsymbol{b_0}$ and then multiply it with the adjacency matrix $\\boldsymbol{A}$.\nThen we normalize the resulting vector $\\boldsymbol{b_1}$ and repeat the process until the vector converges. Most often to\ncheck for convergence we calculate the difference between the two vectors and check if it is smaller than a threshold. \n$$\n\\boldsymbol{b_{i+1}} = \\frac{\\boldsymbol{Ab_i}}{||\\boldsymbol{Ab_i}||_2}\n$$ \n\nThe initial vector $b_0$ in the power iteration method is the importance measure, in this case the degree centrality. However,\nthe initial vector can be any non-zero vector and the method will still converge to the same eigenvector. You could interpret\nthis as the eigenvector centrality being the \"true underlying importance\" of the vertices.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#9", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "PageRank", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#9", "page_content": "\nDo this\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#10", "metadata": {"Header 1": "Centrality", "Header 2": "Vertex Centrality", "Header 3": "Prestige", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#10", "page_content": "In a directed Graph it is possible to analyze the prestige of a vertex, i.e the stature or reputation associated with\na vertex. The vertices relationships however need to resemble this. For example, if a person has a lot of followers\nbut doesn't follow a lot of people, then that person has a high prestige and stature, for example a celebrity. \n#### Popularity \nThe simplest way to measure prestige is to count the number of incoming edges, i.e using the $\\text{indeg()}$ function.\nThis is called popularity. \nexport const localGraph = {\nnodes: [\n{id: 1, label: \"Bob, 1\"},\n{id: 2, label: \"Alice, 2\"},\n{id: 3, label: \"Michael, 4\", color: \"red\"},\n{id: 4, label: \"Urs, 2\"},\n{id: 5, label: \"Karen, 3\"},\n{id: 6, label: \"John, 2\"},\n{id: 7, label: \"Peter, 2\"},\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 2, to: 5},\n{from: 2, to: 6},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 3, to: 5},\n{from: 3, to: 6},\n{from: 3, to: 7},\n{from: 5, to: 1},\n{from: 5, to: 2},\n{from: 6, to: 3},\n{from: 6, to: 7},\n{from: 7, to: 3},\n],\n}; \n \n#### Proximity Prestige \nThe proximity prestige measure does not just account for the number of directly incoming edges, but also the number of\nindirectly incoming edges, i.e. the number of paths that lead to the vertex. However, the longer the path, the lower\nprestige from that path is weighted. \nSimply put the proximity prestige is the sum of all paths that lead to the vertex weighted by the length of the path. 
\nThe formula for the proximity prestige can be summarized pretty simply: \n> The proximity prestige of a vertex is the fraction of vertices that have a path to the vertex divided by the average\nshortest path length leading to the vertex. \nMore formally: \n$$\n\\text{proximityPrestige}(v) = \\frac{\\frac{|I|}{n-1}}{\\frac{\\sum_{i \\in I}{d(i,v)}}{|I|}}\n$$ \nWhere $I$ is the set of all vertices that have a path to $v$ and $d(i,v)$ is the length of the shortest path from $i$ to\n$v$. \n\n\n\n \n$$\n\\begin{align*}\n\\text{proximityPrestige}(2) &= \\frac{\\frac{1}{(8-1)}}{\\frac{1}{1}} = 0.14 \\\\\n\\text{proximityPrestige}(4) &= \\frac{\\frac{2}{(8-1)}}{\\frac{2}{2}} = 0.29 \\\\\n\\text{proximityPrestige}(6) &= \\frac{\\frac{7}{(8-1)}}{\\frac{10}{7}} = 0.7 \\\\\n\\end{align*}\n$$ \n"}}
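The formula can be sketched with a BFS on the reversed edges; the tiny 3-vertex chain used below is hypothetical, not the graph from the figure:

```python
# Proximity prestige (a sketch): fraction of vertices that can reach v,
# divided by the average shortest-path length of those vertices to v.
from collections import deque

def proximity_prestige(nodes, edges, v):
    # BFS on reversed edges gives shortest distances *to* v
    radj = {u: set() for u in nodes}
    for a, b in edges:
        radj[b].add(a)
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in radj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    influencers = {u: d for u, d in dist.items() if u != v}
    if not influencers:
        return 0.0  # nobody reaches v
    reach = len(influencers) / (len(nodes) - 1)
    avg_dist = sum(influencers.values()) / len(influencers)
    return reach / avg_dist

# Hypothetical chain 1 -> 2 -> 3: both 1 and 2 reach 3.
print(proximity_prestige([1, 2, 3], [(1, 2), (2, 3)], 3))  # (2/2) / (3/2) = 2/3
```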
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#12", "metadata": {"Header 1": "Centrality", "Header 2": "Group Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#12", "page_content": "The goal of group centrality measures is to determine the importance of a group of vertices in a graph. These measures\nare based on the vertex centrality measures, but they are more complex and expensive to calculate."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#13", "metadata": {"Header 1": "Centrality", "Header 2": "Group Centrality", "Header 3": "Degree Group Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#13", "page_content": "The degree group centrality is the simplest group centrality measure. It is simply the fraction of the number of\nvertices outside the group that are directly connected to the group. So in the following graph with the group $G$ being\ndefined as $G={v_6,v_7,v_8}$ the degree group centrality would be $\\frac{3}{10}$ so $0.3$. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#14", "metadata": {"Header 1": "Centrality", "Header 2": "Group Centrality", "Header 3": "Closeness Group Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#14", "page_content": "The closeness group centrality measures how close the group is to the other vertices in the graph. It is calculted by\nadding up all inverse distances from the vertices outside the group to the closest vertex in the group. So in the\nsame graph and group $G={v_6,v_7,v_8}$ as above the closeness group centrality would be: \n$$\n1+1+1+\\frac{1}{2}+\\frac{1}{2}+\\frac{1}{2}+\\frac{1}{2}+\\frac{1}{2}+\\frac{1}{2}+\\frac{1}{3} = 6.333\n$$ \nIt can be simply normalized by dividing it by the number of vertices outside the group, which would lead to $0.6333.$"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#15", "metadata": {"Header 1": "Centrality", "Header 2": "Group Centrality", "Header 3": "Betweenness Group Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#15", "page_content": "The betweenness group centrality measures how many shortest paths go through the group. It is calculated by counting\nhow many shortest paths between all the vertices outside the group go through the group. \n\nIf we define our group to contain the vertices $C,E$ from the graph below we can calculate the betweenness group\ncentrality simply by calculating all the shortest paths between the vertices outside the group and counting how many\nof them go through the group. \n \nWe have the following shortest paths between the vertices outside the group: \n- $A \\rightarrow B$\n- $A \\rightarrow C \\rightarrow D$ goes through the group via $C$.\n- $A \\rightarrow C \\rightarrow D \\rightarrow E \\rightarrow G$ goes through the group via $C$ and $E$.\n- $A \\rightarrow C \\rightarrow D \\rightarrow E \\rightarrow F$ goes through the group via $C$ and $E$.\n- $B \\rightarrow C \\rightarrow D$, goes through the group via $C$.\n- $B \\rightarrow C \\rightarrow D \\rightarrow E \\rightarrow F$ goes through the group via $C$ and $E$.\n- $B \\rightarrow C \\rightarrow D \\rightarrow E \\rightarrow G$ goes through the group via $C$ and $E$.\n- $D \\rightarrow E \\rightarrow G$ goes through the group via $E$.\n- $D \\rightarrow E \\rightarrow F$ goes through the group via $E$.\n- $F \\rightarrow G$ \nTherefore 8 of the 10 shortest paths go through the group, so the betweenness group centrality is $\\frac{8}{10} = 0.8$. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#16", "metadata": {"Header 1": "Centrality", "Header 2": "Network Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#16", "page_content": "The idea of network centrality is to measure the centrality of the entire network, i.e. to compare the difference in\ncentrality between the vertices in the network. The goal is then to show how different the key vertices are from the\nrest of the network."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#17", "metadata": {"Header 1": "Centrality", "Header 2": "Network Centrality", "Header 3": "General Network Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#17", "page_content": "To calculate the network centrality the vertex centrality measures are used. For this Linton Freeman defined a general\nformula that returns a value between $0$ and $1$ with the following meanings: \n- $0$ means that all vertices have the same centrality, i.e. the network is a ring network.\n- $1$ means that one vertex has all the centrality, i.e. the network is a star network. \nexport const starGraph = {\nnodes: [\n{id: 1, label: \"1\"},\n{id: 2, label: \"2\"},\n{id: 3, label: \"3\"},\n{id: 4, label: \"4\"},\n{id: 5, label: \"5\"},\n{id: 6, label: \"6\"},\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 1, to: 6},\n],\n}; \nexport const ringGraph = {\nnodes: [\n{id: 1, label: \"1\"},\n{id: 2, label: \"2\"},\n{id: 3, label: \"3\"},\n{id: 4, label: \"4\"},\n{id: 5, label: \"5\"},\n{id: 6, label: \"6\"},\n],\nedges: [\n{from: 1, to: 2},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 5, to: 6},\n{from: 6, to: 1},\n],\n}; \n\n\n\n\n\n\n\n\n\n\n\n \nThe formula is as follows: \n$$\n\\text{networkCentrality}(G) = \\frac{\\sum_{v \\in V}{C_{max} - C(v)}}{Star_n}\n$$ \nWhere:\n- $C(v)$ is the centrality function for a vertex $v$.\n- $C_{max}$ is the maximum centrality of all vertices in the graph, i.e $ C_{max}= \\argmax_{v \\in V}{C(v)}$.\n- The denominator $Star_n$ is the maximal sum of differences between\nthe centrality of a vertex and the maximum centrality of all vertices in the graph, i.e. 
if the graph were a star graph\nwith the same number of vertices as the graph $G$, so $n=|V|$ (Is this always the case, no matter the centrality measure?). \nWith the definition above it is now logical why the value is $1$ when the graph is a star graph, because the numerator and\ndenominator are the same. Whereas if the graph is a ring graph, i.e. all vertices have the same centrality, then the"}}
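Freeman's formula can be sketched for the degree centrality with the two 6-vertex graphs from above, where the star graph should give $1$ and the ring graph $0$:

```python
# Freeman's general network centrality with the degree centrality as
# C(v) (a sketch): sum of (C_max - C(v)) divided by the same sum for a
# star graph, which for the unnormalized degree centrality is (n-1)(n-2).
from collections import defaultdict

def degree_network_centrality(n, edges):
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    degrees = [deg[v] for v in range(1, n + 1)]
    c_max = max(degrees)
    star_n = (n - 1) * (n - 2)
    return sum(c_max - c for c in degrees) / star_n

star_edges = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]
ring_edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
print(degree_network_centrality(6, star_edges))  # 1.0
print(degree_network_centrality(6, ring_edges))  # 0.0
```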
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#18", "metadata": {"Header 1": "Centrality", "Header 2": "Network Centrality", "Header 3": "General Network Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#18", "page_content": "With the definition above it is now logical why the value is $1$ when the graph is a star graph because the numerator and\ndenominator are the same. Whereas if the graph is a ring graph, i.e. all vertices have the same centrality, then the\nsum of differences in the numerator is $0$ and the denominator is the maximum sum of differences, which leads to the\nvalue being $0$. \n\nDepending on the definition of the general formula the Sum in the nominator skips the vertex with the maximum\ncentrality since the difference would be $0$. I find the definition above more intuitive, but it is important to\nknow that there are different definitions.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#19", "metadata": {"Header 1": "Centrality", "Header 2": "Network Centrality", "Header 3": "Degree Network Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#19", "page_content": "For the degree network centrality the denominator is pretty simple, because for a star graph the key vertex will have a\ndegree of $n-1$ and the other vertices will have a degree of $1$. So the denominator is simply $(n-1)(n-2)$ for an\nundirected Graph, if it is a directed Graph then the nominator can just be doubled. \nIf you are working with the normalized degree centrality, then the denominator can be even further simplified to just\n$n-2$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#20", "metadata": {"Header 1": "Centrality", "Header 2": "Network Centrality", "Header 3": "Closeness Network Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#20", "page_content": "When using the normalized closeness centrality, the denominator is simply $\\frac{n-2}{2}$. I will save you the details\njust trust me bro."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#21", "metadata": {"Header 1": "Centrality", "Header 2": "Network Centrality", "Header 3": "Betweenness Network Centrality", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/centrality.mdx#21", "page_content": "When using the normalized betweenness centrality, the denominator is simply $n-1$, just like with the degree centrality."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#1", "metadata": {"Header 1": "Communities", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#1", "page_content": "Communities are subgraphs (subsets or groups of vertices of the original graph), that are better connected to each\nother than to the rest of the graph. Communities are very important when analyzing social networks and networks in\ngeneral as they often form around a context or a topic such as family, friends, work, hobbies, etc. \nThese communities can then be further analyzed such as to find out who are the most important people in a community,\nwhat is there impact on the community, and how do they relate to other communities. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#2", "metadata": {"Header 1": "Communities", "Header 2": "Neighborhoods", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#2", "page_content": "The neighborhood of a vertex $v$ is the set of all vertices that are connected to $v$ by an edge, and is denoted by\n$N(v)$ or $N_G(v)$ if the graph is not ambiguous. The neighborhood of a vertex is also sometimes referred to as the open\nneighborhood when it does not include the vertex itself $v$, and the closed neighborhood when it does include the vertex\nitself. The default is the open neighborhood, whereas the closed neighborhood is denoted by $N[v]$ or $N_G[v]$. \nexport const neighborhoodGraph = {\nnodes: [\n{id: 1, label: \"a\", x: 0, y: 0, color: \"green\"},\n{id: 2, label: \"b\", x: 0, y: 200, color: \"green\"},\n{id: 3, label: \"c\", x: 200, y: 100, color: \"red\"},\n{id: 4, label: \"d\", x: 400, y: 100, color: \"green\"},\n{id: 5, label: \"e\", x: 600, y: 100},\n{id: 6, label: \"f\", x: 800, y: 0},\n{id: 7, label: \"g\", x: 800, y: 200}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 5, to: 6},\n{from: 5, to: 7},\n{from: 6, to: 7}\n]\n}; \n\n\nFor the given Graph $G$ and the vertex $c$, the neighborhood $N[c]$ is the set of vertices $\\{a, b, d\\}$.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#3", "metadata": {"Header 1": "Communities", "Header 2": "Connected Components", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#3", "page_content": "Simply put a connected component is a subgraph of the original graph where all vertices are connected to each other. So\nThere are no disconnected vertices in a connected component. These can quiet easily be seen by eye but the definition\ncan become more complex when we look at directed graphs."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#4", "metadata": {"Header 1": "Communities", "Header 2": "Connected Components", "Header 3": "Undirected Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#4", "page_content": "In an undirected graph, a connected component is a subset of vertices such that there is a path between every pair of\nvertices in the subset. In other words, a connected component is a subgraph of the original graph where all vertices\nare connected to each other. \nThis could be useful for example to find out if a graph is fully connected or not. If the graph has only one connected\ncomponent, then it is fully connected. If it has more than one connected component, then it is not fully connected. \nIf we think of a communication network, then a connected component would be a group of people that can communicate with\neach other. If there are multiple connected components, then there are groups of people that cannot communicate with\neach other. \n \nTo find the connected components of a graph, we can simply use either a breadth-first search or a depth-first search\nover all vertices. The algorithm would then look something like this: \n"}}
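The BFS variant of the algorithm could look something like the sketch below. The dict-of-neighbor-sets representation is my own choice for the example:

```python
from collections import deque

def connected_components(graph):
    """Find the connected components of an undirected graph using BFS.

    graph: dict mapping each vertex to the set of its neighbors.
    Returns a list of components, each a set of vertices.
    """
    visited = set()
    components = []
    for start in graph:
        if start in visited:
            continue
        # A BFS from every not-yet-visited vertex collects one component.
        component = set()
        queue = deque([start])
        visited.add(start)
        while queue:
            v = queue.popleft()
            component.add(v)
            for u in graph[v]:
                if u not in visited:
                    visited.add(u)
                    queue.append(u)
        components.append(component)
    return components

# Two disconnected triangles give two components, so the graph is not
# fully connected.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}, 4: {5, 6}, 5: {4, 6}, 6: {4, 5}}
assert connected_components(G) == [{1, 2, 3}, {4, 5, 6}]
```

Checking whether the graph is fully connected is then just checking whether the returned list has length one.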
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#5", "metadata": {"Header 1": "Communities", "Header 2": "Connected Components", "Header 3": "Directed Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#5", "page_content": "In a directed graph the directions of the edges matter. This gives us two types of connected components, weakly\nconnected components and strongly connected components. \n#### Weakly Connected Components \nWeakly connected components are the same as connected components in an undirected graph, so you just ignore the\ndirections of the edges. \n \n#### Strongly Connected Components \nStrongly connected components are a bit more complex. In a directed graph, a strongly connected component is a subset of\nvertices such that there is a path between every pair of vertices in the subset, but the path must follow the direction\nof the edges. \n"}}
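Strongly connected components can be found with Kosaraju's algorithm, which is not mentioned in the text above but is one standard way to do it: one DFS pass to record finishing times, then a second pass on the reversed graph. A minimal iterative sketch, assuming a dict of successor sets:

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm: two DFS passes, the second on the reversed graph.

    graph: dict mapping each vertex to the set of its successors (directed).
    Every vertex must appear as a key. Returns a list of sets.
    """
    # First pass: record vertices in order of DFS completion.
    visited, order = set(), []
    for start in graph:
        if start in visited:
            continue
        stack = [(start, iter(graph[start]))]
        visited.add(start)
        while stack:
            v, succs = stack[-1]
            for u in succs:
                if u not in visited:
                    visited.add(u)
                    stack.append((u, iter(graph[u])))
                    break
            else:
                order.append(v)  # v is fully explored
                stack.pop()
    # Build the reversed graph.
    reverse = {v: set() for v in graph}
    for v, succs in graph.items():
        for u in succs:
            reverse[u].add(v)
    # Second pass: DFS on the reversed graph in reverse finishing order;
    # each DFS tree is one strongly connected component.
    visited, sccs = set(), []
    for start in reversed(order):
        if start in visited:
            continue
        scc, stack = set(), [start]
        visited.add(start)
        while stack:
            v = stack.pop()
            scc.add(v)
            for u in reverse[v]:
                if u not in visited:
                    visited.add(u)
                    stack.append(u)
        sccs.append(scc)
    return sccs

# a -> b -> c -> a is a cycle (one SCC); d is reachable but has no way back.
G = {"a": {"b"}, "b": {"c"}, "c": {"a", "d"}, "d": set()}
assert sorted(map(sorted, strongly_connected_components(G))) == [["a", "b", "c"], ["d"]]
```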
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#6", "metadata": {"Header 1": "Communities", "Header 2": "Connected Components", "Header 3": "Giant Components", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#6", "page_content": "If a connected component includes a large portion of the graph, then it is commonly referred to as a\n**\"giant component\"**. There is no strict definition of what a giant component is, but it is commonly used to refer to\nconnected components that include more than 50% of the vertices in the graph."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#7", "metadata": {"Header 1": "Communities", "Header 2": "Cliques", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#7", "page_content": "Cliques focus on undirected graphs. A clique is a complete subgraph of the original graph, i.e. a subgraph where all\nvertices are connected to each other. Cliques are very important in social networks as they represent groups of people\nthat all know each other; in a communication network, however, they would represent a group with redundant connections. \nBecause cliques are complete subgraphs, they are very easy to see but also happen to be very rare and hard to find\nalgorithmically. In the graph below the two cliques have been highlighted in red and blue. \n \n\nAlgorithms seem to be separated into finding a maximal clique or finding cliques of a certain size.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#8", "metadata": {"Header 1": "Communities", "Header 2": "Cliques", "Header 3": "Clustering Coefficient", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#8", "page_content": "The clustering coefficient is a metric that measures how close a graph is to being a clique (don't ask me why it isn't\ncalled Clique Coefficient). There are two different versions of the clustering coefficient, the global clustering\ncoefficient and the local clustering coefficient, where the global clustering coefficient is just the average of the\nlocal clustering coefficients. \nThe idea behind the local clustering coefficient is to check how many of the neighbors of a vertex are connected to each\nother. If all neighbors are connected to each other, then the local clustering coefficient for that vertex is $1$. More\nformally, the local clustering coefficient for a vertex $v$ is defined as: \n$$\n\\text{localClusterCoeff}(v) = \\frac{2 \\cdot \\text{numEdgesBetweenNeighbors}(v)}{|N(v)| \\cdot (|N(v)| - 1)}\n$$ \nwhere $N(v)$ denotes the set of neighbors of $v$. 
\nexport const clusterCoeff1 = {\nnodes: [\n{id: 1, label: \"a\", x: 0, y: 100, color: \"red\"},\n{id: 2, label: \"b\", x: 100, y: 200, color: \"green\"},\n{id: 3, label: \"c\", x: 200, y: 100, color: \"green\"},\n{id: 4, label: \"d\", x: 100, y: 0, color: \"green\"},\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 1, to: 4},\n{from: 2, to: 3},\n{from: 2, to: 4},\n{from: 3, to: 4},\n]\n}; \nexport const clusterCoeff0 = {\nnodes: [\n{id: 1, label: \"a\", x: 0, y: 100, color: \"red\"},\n{id: 2, label: \"b\", x: 100, y: 200, color: \"green\"},\n{id: 3, label: \"c\", x: 200, y: 100, color: \"green\"},\n{id: 4, label: \"d\", x: 100, y: 0, color: \"green\"},\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 1, to: 4},\n]\n}; \n\n\n\n\nFor the given Graph $G$ and the vertex $a$, the cluster coefficient is $1$ because all neighbors are connected.\n\n\n\n\n\nFor the given Graph $G$ and the vertex $a$, the cluster coefficient is $0$ because none of the neighbors are\nconnected.\n"}}
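The two formulas can be translated almost directly into code. The helper names and the dict-of-neighbor-sets representation are my own; the two example graphs mirror the ones above:

```python
from itertools import combinations

def local_clustering_coefficient(graph, v):
    """Fraction of pairs of v's neighbors that are connected to each other."""
    neighbors = graph[v]
    k = len(neighbors)
    if k < 2:
        return 0.0  # undefined for fewer than two neighbors; 0 by convention
    edges_between = sum(1 for a, b in combinations(neighbors, 2) if b in graph[a])
    return 2 * edges_between / (k * (k - 1))

def global_clustering_coefficient(graph):
    """Average of the local clustering coefficients over all vertices."""
    return sum(local_clustering_coefficient(graph, v) for v in graph) / len(graph)

# First example: a's neighbors b, c, d are all connected, so the coefficient is 1.
K4 = {"a": {"b", "c", "d"}, "b": {"a", "c", "d"},
      "c": {"a", "b", "d"}, "d": {"a", "b", "c"}}
assert local_clustering_coefficient(K4, "a") == 1.0

# Second example: none of a's neighbors share an edge, so the coefficient is 0.
star = {"a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"}}
assert local_clustering_coefficient(star, "a") == 0.0
```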
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#9", "metadata": {"Header 1": "Communities", "Header 2": "Cliques", "Header 3": "Clustering Coefficient", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#9", "page_content": "\n\n\n\n\nFor the given Graph $G$ and the vertex $a$, the cluster coefficient is $0$ because none of the neighbors are\nconnected.\n\n\n \nThe global clustering coefficient is then just the average of the local clustering coefficients of all vertices in the\ngraph. \n$$\n\\text{globalClusterCoeff}(G) = \\frac{1}{|V|} \\sum_{v \\in V} \\text{localClusterCoeff}(v)\n$$"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#10", "metadata": {"Header 1": "Communities", "Header 2": "Cliques", "Header 3": "k-Core", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#10", "page_content": "For a k-Core the rules of a clique are slightly relaxed. A k-Core is a subgraph where all vertices are connected to at\nleast $k$ other vertices in the subgraph. \n \nAlthough this is a relaxation of the rules, it is still a very strict rule: a vertex that is only connected to vertices\nin a core, but has fewer than $k$ such connections, is not included in that core. \n\nThe degeneracy of a graph and k-degenerate graphs capture exactly this idea\n"}}
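The k-core can be computed by repeatedly peeling off vertices of degree less than $k$. A minimal sketch, assuming the usual dict-of-neighbor-sets representation:

```python
def k_core(graph, k):
    """Iteratively remove vertices with degree < k; what remains is the k-core.

    graph: dict mapping each vertex to the set of its neighbors (undirected).
    Returns the set of vertices in the k-core (possibly empty).
    """
    core = {v: set(ns) for v, ns in graph.items()}  # work on a copy
    changed = True
    while changed:
        changed = False
        for v in [v for v, ns in core.items() if len(ns) < k]:
            # Removing v lowers its neighbors' degrees, hence the outer loop.
            for u in core[v]:
                core[u].discard(v)
            del core[v]
            changed = True
    return set(core)

# A triangle with a pendant vertex: the 2-core is just the triangle,
# and there is no 3-core at all.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
assert k_core(G, 2) == {1, 2, 3}
assert k_core(G, 3) == set()
```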
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#11", "metadata": {"Header 1": "Communities", "Header 2": "Cliques", "Header 3": "p-Clique", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#11", "page_content": "The idea of a p-clique is also to relax the rules of a clique whilst also solving the above-mentioned issue of the\nk-core. In a p-clique, the p stands for a percentage in decimal i.e. a ratio. So in a p-clique at least the given\npercentage of a vertex's edges must be connected to other vertices in the subgraph. \nSo if we have a 0.5-clique, then at least 50% of the edges of a vertex must be connected to other vertices in the subgraph. This\nthen allows vertices that don't fulfill the k-core rule, but are only connected to other vertices in the subgraph, to\nstill be included in the subgraph."}}
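Checking whether a given vertex subset satisfies the p-clique condition is straightforward. The function name and example graph are illustrative, not from the original:

```python
def is_p_clique(graph, subset, p):
    """Check whether `subset` is a p-clique: every vertex in it must have at
    least a fraction p of its edges going to other vertices in the subset.

    graph: dict mapping each vertex to the set of its neighbors (undirected).
    """
    for v in subset:
        degree = len(graph[v])
        inside = len(graph[v] & subset)
        if degree == 0 or inside / degree < p:
            return False
    return True

# Vertex 4 has 2 of its 3 edges inside {1, 2, 4}, a ratio of 2/3, so the
# subset is a 0.5-clique but not a 0.7-clique.
G = {1: {2, 4}, 2: {1, 4}, 4: {1, 2, 5}, 5: {4}}
assert is_p_clique(G, {1, 2, 4}, 0.5)
assert not is_p_clique(G, {1, 2, 4}, 0.7)
```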
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#12", "metadata": {"Header 1": "Communities", "Header 2": "Cliques", "Header 3": "n-Clique", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#12", "page_content": "\nSometimes cliques are named after the number of vertices they contain. For example a clique with 3 vertices is\ncalled a 3-clique, a clique with 4 vertices is called a 4-clique, etc. This can be generalized to a k-clique. Not an\nn-clique though, that is something else, but when it just says 4-clique it can be ambiguous.\n \nThe idea of an n-clique is that we want a maximal subgraph, i.e. with the most vertices, where each pair of vertices can\nbe connected by a path of length at most n. So a 1-clique is just a normal clique, a 2-clique is a clique where each\npair of vertices can be connected by a path of length at most 2, etc. \n\nThe path doesn't have to be the shortest path, just a path of length at most n. And the path can go over any vertex,\nnot just vertices that are part of the clique.\n \nThis can lead to two interesting scenarios: \n1. The diameter of the subgraph can actually be longer than n. This is due to the path being able to go over any vertex,\nnot just vertices that are part of the clique. So in the example below, the diameter of the subgraph is 3 even though it\nis a 2-clique. \n \n2. The subgraph can be disconnected. In the example below you can see two of many possible 2-cliques for the given graph.\nInterestingly, they are both disconnected, because if one of the vertices in between is included, then a different vertex\ncan no longer be included. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#13", "metadata": {"Header 1": "Communities", "Header 2": "Clustering", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#13", "page_content": "In general clustering is the process of grouping similar objects together. In graph theory, the clustering process can\nbe seen as a way to group vertices together i.e. to find communities that aren't based on specific rules like cliques\nor connected components. \nThere are two main approaches to clustering graphs: \n- bottom-up: start with each vertex in its own cluster and then merge clusters together\n- top-down: start with all vertices in one cluster and then split the cluster into smaller clusters"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#14", "metadata": {"Header 1": "Communities", "Header 2": "Clustering", "Header 3": "Girvan-Newman Clustering", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#14", "page_content": "The Girvan-Newman clustering algorithm is a top-down (divisive) approach to clustering which is based on edge betweenness, hence\nit is also called edge betweenness clustering. The idea is to iteratively calculate the edge-betweenness of each edge in\nthe graph and then remove the edge with the highest edge-betweenness. \nThe thought process behind this is that the edges with the highest edge-betweenness are the edges that have the highest\ninformation flow. So by removing these edges, we are removing the edges that connect two groups/clusters/communities\ntogether. Eventually this will lead to two components, which are then the clusters. \n\n\n\n\n\n\n\n \nThe issue with this approach is that it is very computationally expensive. Calculating the edge-betweenness of all edges\ntakes $O(|V||E|)$, and this has to be repeated for every edge that is removed, so the overall complexity is roughly\n$O(|E|^2|V|)$, which for sparse graphs is commonly summarized as $O(n^3)$. That is not ideal for large graphs."}}
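The removal loop can be sketched as below. To keep the sketch short, edge betweenness is approximated by counting the edges on just *one* shortest path per vertex pair (the real algorithm distributes credit over all shortest paths, typically via Brandes' algorithm), so this is an illustration of the idea rather than the exact method:

```python
from collections import deque
from itertools import combinations

def components(graph):
    """Connected components of an undirected graph (dict of neighbor sets)."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            comp.add(v)
            for u in graph[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        comps.append(comp)
    return comps

def shortest_path(graph, s, t):
    """One shortest path from s to t via BFS with parent pointers."""
    parents, queue = {s: None}, deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            path = []
            while v is not None:
                path.append(v)
                v = parents[v]
            return path[::-1]
        for u in graph[v]:
            if u not in parents:
                parents[u] = v
                queue.append(u)
    return None

def girvan_newman_split(graph):
    """Remove the edge with the highest (approximate) betweenness until the
    graph splits into more than one component; return the components."""
    g = {v: set(ns) for v, ns in graph.items()}
    while True:
        comps = components(g)
        if len(comps) > 1:
            return comps
        betweenness = {}
        for s, t in combinations(g, 2):
            path = shortest_path(g, s, t)
            for a, b in zip(path, path[1:]):
                edge = frozenset((a, b))
                betweenness[edge] = betweenness.get(edge, 0) + 1
        a, b = max(betweenness, key=betweenness.get)
        g[a].discard(b)
        g[b].discard(a)

# Two triangles joined by one edge: the joining edge carries all 9 cross
# pairs, is removed first, and the graph splits into the two triangles.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
assert girvan_newman_split(G) == [{1, 2, 3}, {4, 5, 6}]
```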
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#15", "metadata": {"Header 1": "Communities", "Header 2": "Clustering", "Header 3": "LPA - Label Propagation Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#15", "page_content": "The LPA is a more general algorithm which doesn't just have to be used for clustering graphs; it can also be used to\ncluster data in general. However, I will explain it in the context of graph clustering. \n\nMaybe one day this can be done in the context of semi-supervised labeling\n \nLPA consists of two parts: the preparation and the actual algorithm. In the preparation we do the following: \n1. We assign each vertex a unique label from $0$ to $|V| - 1$. The labels in the end will be the clusters, which makes\nthis a bottom-up approach. \n2. We perform graph coloring. I will not go into detail about graph coloring here, but the idea is to color the graph\nsuch that no two connected/neighboring vertices have the same color whilst using as few colors as possible. \n\nMaybe add a link to the graph coloring chapter if it ever gets written.\n \nOnce the preparation is done, we can start the actual algorithm. The algorithm is very simple: \nFor each color (always in the same order) we go through each vertex (also always in the same order) and check the\nlabels of its neighbors and count how many times each one occurs. If there is a label that occurs more often than the\nothers, then we assign that label to the vertex. 
If there are multiple labels that occur the same number of times, then\nthere are two options: \n- If the vertex's label is one of the labels that occur the most, then we keep the label.\n- If the vertex's label is not one of the labels that occur the most, then we assign it the label with the highest value.\nLowest would also work, as long as it is consistent. \nThis is repeated until the labels don't change anymore. The labels in the end then represent the clusters. The algorithm\nis very simple and fast, making it a good choice for large graphs. However, it is not\ndeterministic, i.e. it can lead to different results depending on the order of the colors and vertices. This can be"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#16", "metadata": {"Header 1": "Communities", "Header 2": "Clustering", "Header 3": "LPA - Label Propagation Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#16", "page_content": "is very simple and fast, making it a good choice for large graphs. However, it is not\ndeterministic, i.e. it can lead to different results depending on the order of the colors and vertices. This can be\nmitigated by running the algorithm multiple times and then taking the most common result. \n\nAfter the initial setup, we get the graph below: \n \nWe will work through the graph in the following order: \n- Blue: $B, F$\n- Green: $D, A, H, C$\n- Brown: $E, G$ \n\n\nWe start with vertex $B$ which has the neighbors $A,C,D,E$ with the labels $0,2,3,4$. The vertex $B$ has the\nlabel $1$. Because all the neighboring labels occur once and the vertex's label is not one of them, we\npick the one with the highest value, which is $4$. So we assign the label $4$ to $B$.\n\n\nWe have a similar situation for the next vertex $F$ which gets assigned the label $7$.\n\n\n\nNow we do the same with the green vertices.\n\n\n\nLastly, we process the brown vertices in the given order.\n\nLuckily with this graph, we already have our clusters after the first iteration. We have two clusters, the\nvertices with the label 4 and the vertices with the label 7.\n\n \n \n\nMake my own images where the graph is processed alphabetically. And what if we want more than 2 clusters?\n"}}
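The preparation and propagation steps described above can be sketched as follows. The greedy coloring, the fixed processing order, and the tie-breaking rules match the description; the helper names and the example graph are my own:

```python
from collections import Counter

def label_propagation(graph, max_rounds=100):
    """LPA sketch: greedy graph coloring fixes the update order, then each
    vertex repeatedly adopts the majority label among its neighbors."""
    # Preparation 1: every vertex starts with its own unique label.
    labels = {v: i for i, v in enumerate(graph)}
    # Preparation 2: greedy coloring so no two neighbors share a color.
    colors = {}
    for v in graph:
        used = {colors[u] for u in graph[v] if u in colors}
        colors[v] = next(c for c in range(len(graph)) if c not in used)
    # Always process color by color, in a fixed vertex order within a color.
    order = sorted(graph, key=lambda v: (colors[v], labels[v]))
    for _ in range(max_rounds):
        changed = False
        for v in order:
            counts = Counter(labels[u] for u in graph[v])
            best = max(counts.values())
            winners = {label for label, c in counts.items() if c == best}
            if labels[v] in winners:
                continue  # on a tie that includes our own label, keep it
            labels[v] = max(winners)  # tie-break: highest label value
            changed = True
        if not changed:
            break  # labels are stable: these are the clusters
    return labels

# Two triangles joined by one edge collapse into two label groups.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
result = label_propagation(G)
assert {v for v in result if result[v] == result[1]} == {1, 2, 3}
assert {v for v in result if result[v] == result[4]} == {4, 5, 6}
```

Because the iteration order is fixed here, this sketch is deterministic; randomizing the order would reproduce the nondeterminism mentioned above.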
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#17", "metadata": {"Header 1": "Communities", "Header 2": "Clustering", "Header 3": "Louvain Clustering", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#17", "page_content": "The Louvain clustering algorithm is a bottom-up greedy approach to clustering which is based on modularity. So we first\nneed to understand what modularity is. \n#### Modularity \nModularity is a metric that measures the quality of a clustering. The idea is to compare the number of edges within a\ncluster with the number of edges between clusters. A good clustering would then have a lot of edges within a cluster\nand not many edges between clusters. \nModularity is defined as the fraction of edges of a graph within a cluster minus the expected fraction of edges within\na cluster if the edges were distributed randomly. The value of modularity is between $\\frac{-1}{2}$ and $1$, where\nany value above 0 means that the number of edges within a cluster is higher than the expected number of edges within a\ncluster if the edges were distributed randomly. The higher the value, the better the clustering, if the value is above\n0.3 then the clustering is considered to be good. \n$$\n\\text{modularity}(G) = \\frac{1}{2m} \\sum_{i,j \\in V} \\left( A_{ij} - \\frac{deg(i) deg(j)}{2m} \\right) \\delta(c_i, c_j)\n$$ \nwith the following definitions: \n- $A_{ij}$ is the weight of the edge between vertices $i$ and $j$\n- $m$ is the sum of all edge weights so for an unweighted graph $m = |E|$ and for a weighted graph $m = \\sum_{i,j \\in V} A_{ij}$.\n- $\\delta(c_i, c_j)$ is the Kronecker delta function (1 if $c_i = c_j$ and 0 otherwise), which is used to check if two\nvertices are in the same cluster. 
\n#### The Louvain Algorithm \nThe Louvain algorithm then tries to maximize the modularity of a graph in an iterative process until the modularity\ncannot be increased anymore, hence it is a greedy approach. \nInitially each vertex is in its own cluster. We then iteratively perform the following steps: \n- **Modularity Optimization:** For each vertex we check how the modularity would change if we would\nmove it to a neighboring cluster. If the modularity would increase, then we move the vertex to the neighboring cluster"}}
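The modularity formula can be computed directly for small unweighted graphs. A minimal sketch (the function names and example clustering are mine):

```python
def modularity(graph, clusters):
    """Modularity of a clustering of an unweighted, undirected graph.

    graph: dict mapping each vertex to the set of its neighbors.
    clusters: dict mapping each vertex to a cluster id.
    Follows the formula above with A_ij in {0, 1} and m = |E|.
    """
    m = sum(len(ns) for ns in graph.values()) / 2  # every edge is stored twice
    total = 0.0
    for i in graph:
        for j in graph:
            if clusters[i] != clusters[j]:
                continue  # the Kronecker delta is 0 for different clusters
            a_ij = 1 if j in graph[i] else 0
            total += a_ij - len(graph[i]) * len(graph[j]) / (2 * m)
    return total / (2 * m)

# Two triangles joined by a bridge, clustered along the bridge: a good split
# with modularity 5/14, which is roughly 0.357 and above the 0.3 threshold.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
good = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
assert abs(modularity(G, good) - 5 / 14) < 1e-9

# Putting everything into one cluster always gives modularity 0.
assert abs(modularity(G, {v: 0 for v in G})) < 1e-9
```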
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#18", "metadata": {"Header 1": "Communities", "Header 2": "Clustering", "Header 3": "Louvain Clustering", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/communities.mdx#18", "page_content": "- **Modularity Optimization:** For each vertex we check how the modularity would change if we would\nmove it to a neighboring cluster. If the modularity would increase, then we move the vertex to the neighboring cluster\nwhich would increase the modularity the most. If the modularity would not increase, then we leave the vertex in its\ncurrent cluster. Once we have gone through all vertices, we move on to the next step.\n- **Cluster Aggregation:** We then aggregate all vertices in the same cluster into a single vertex. This vertex has a\nself-looping edge with a weight equal to the sum of all the edges of the vertices in the cluster. The vertices resembling\nthe clusters are then connected to each other with edges of weight equal to the sum of all the edges between the\nclusters before the aggregation. We then go back to the first step and repeat the process until the modularity cannot be\nincreased anymore. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#1", "metadata": {"Header 1": "Connectivity", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#1", "page_content": "Graph connectivity is also known as graph resilience and is a measure of how well a graph can maintain its connectivity\nwhen vertices or edges are removed, i.e. how many vertices or edges can be removed before the graph becomes disconnected\n(from one connected component to multiple connected components) or has a higher number of connected components. \nWith this analysis technique we can find out how robust a graph is, i.e. how well it can handle failures which can be\nvery useful in real world applications such as communication, transportation, etc."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#2", "metadata": {"Header 1": "Connectivity", "Header 2": "Bridges", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#2", "page_content": "A Bridge is an edge that if removed would increase the number of connected components in the graph. In the graph below\nyou can quite clearly see that the edge between vertices $3$ and $4$ marked in red is a bridge. \nexport const bridgeGraph = {\nnodes: [\n{id: 1, label: \"1\", x: 0, y: 0},\n{id: 2, label: \"2\", x: 0, y: 200},\n{id: 3, label: \"3\", x: 200, y: 100},\n{id: 4, label: \"4\", x: 400, y: 100},\n{id: 5, label: \"5\", x: 600, y: 0},\n{id: 6, label: \"6\", x: 600, y: 200}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 3},\n{from: 3, to: 4, color: \"red\", width: 5},\n{from: 4, to: 5},\n{from: 4, to: 6},\n{from: 5, to: 6}\n]\n}; \n"}}
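A naive way to find bridges follows the definition directly: remove each edge and check whether the component count grows. This is a sketch of the idea (Tarjan's bridge-finding algorithm does the same in $O(|V|+|E|)$); the example graph mirrors the one above:

```python
def num_components(graph):
    """Count connected components (graph as a dict of neighbor sets)."""
    seen, count = set(), 0
    for start in graph:
        if start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            for u in graph[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
    return count

def find_bridges(graph):
    """Naive bridge search: an edge is a bridge if removing it increases the
    number of connected components."""
    base = num_components(graph)
    edges = {frozenset((v, u)) for v in graph for u in graph[v]}
    bridges = []
    for edge in edges:
        a, b = edge
        g = {v: set(ns) for v, ns in graph.items()}  # copy without this edge
        g[a].discard(b)
        g[b].discard(a)
        if num_components(g) > base:
            bridges.append(tuple(sorted(edge)))
    return sorted(bridges)

# The bridge graph from above: only the edge between 3 and 4 is a bridge.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
assert find_bridges(G) == [(3, 4)]
```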
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#3", "metadata": {"Header 1": "Connectivity", "Header 2": "Cut Vertices", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#3", "page_content": "The same idea as a bridge also applies to vertices. A vertex is a cut vertex if removing it would increase the number\nof connected components in the graph. In the graph below you can quite clearly see that the vertices $3$ and $4$ are cut\nvertices. These cut vertices are very important vertices as they are brokers between different parts of the graph. \nexport const cutVerticesGraph = {\nnodes: [\n{id: 1, label: \"1\", x: 0, y: 0},\n{id: 2, label: \"2\", x: 0, y: 200},\n{id: 3, label: \"3\", value: 5, x: 200, y: 100, color: \"red\"},\n{id: 4, label: \"4\", value: 5, x: 400, y: 100, color: \"red\"},\n{id: 5, label: \"5\", x: 600, y: 0},\n{id: 6, label: \"6\", x: 600, y: 200}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 3},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 4, to: 6},\n{from: 5, to: 6}\n]\n}; \n"}}
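The same naive check works for cut vertices: remove each vertex (together with its edges) and see whether the remaining vertices fall into more components. Again just a sketch of the definition, not an efficient algorithm:

```python
def find_cut_vertices(graph):
    """Naive cut vertex search: a vertex is a cut vertex if removing it
    increases the number of connected components among the rest."""
    def num_components(g):
        seen, count = set(), 0
        for start in g:
            if start in seen:
                continue
            count += 1
            stack = [start]
            seen.add(start)
            while stack:
                v = stack.pop()
                for u in g[v]:
                    if u not in seen:
                        seen.add(u)
                        stack.append(u)
        return count

    base = num_components(graph)
    cut_vertices = []
    for v in graph:
        # Remove v and all edges touching it.
        g = {w: ns - {v} for w, ns in graph.items() if w != v}
        if num_components(g) > base:
            cut_vertices.append(v)
    return sorted(cut_vertices)

# The cut vertices graph from above: removing 3 or 4 disconnects the graph.
G = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5}}
assert find_cut_vertices(G) == [3, 4]
```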
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#4", "metadata": {"Header 1": "Connectivity", "Header 2": "k-Connected Graphs", "Header 3": "k-Vertex-Connected Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#4", "page_content": "A graph is $k$-vertex-connected if it has at least $k+1$ vertices and at least $k$ vertices have to be removed to disconnect\nthe graph. \nThe vertex connectivity of a graph $G$ is the largest $k$ such that $G$ is $k$-vertex-connected. So for example the graph\nbelow has a vertex connectivity of 2, because it is 2-vertex-connected. If we remove the vertices $4$ and $2$ the graph\nbecomes disconnected but if we only remove one vertex the graph stays connected. \nexport const vertexConnectedGraph = {\nnodes: [\n{id: 1, label: \"1\", x: 0, y: 100},\n{id: 2, label: \"2\", value: 5, x: 200, y: 0, color: \"red\"},\n{id: 3, label: \"3\", x: 200, y: 200},\n{id: 4, label: \"4\", value: 5, x: 400, y: 100, color: \"red\"},\n{id: 5, label: \"5\", x: 600, y: 100},\n{id: 6, label: \"6\", x: 800, y: 200},\n{id: 7, label: \"7\", x: 800, y: 0},\n{id: 8, label: \"8\", x: 1000, y: 100}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 4},\n{from: 3, to: 4},\n{from: 4, to: 5},\n{from: 5, to: 6},\n{from: 5, to: 7},\n{from: 6, to: 8},\n{from: 7, to: 8},\n{from: 2, to: 7},\n]\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#5", "metadata": {"Header 1": "Connectivity", "Header 2": "k-Connected Graphs", "Header 3": "k-Edge-Connected Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/connectivity.mdx#5", "page_content": "The same idea as for vertex connectivity also applies to edge connectivity. A graph is $k$-edge-connected if it has at\nleast $k+1$ vertices and at least $k$ edges have to be removed to disconnect the graph. So the graph below is 2-edge-connected\nand also has an edge connectivity of 2. If we remove the edges $(2,5)$ and $(4,5)$ the graph becomes disconnected. \nexport const edgeConnectedGraph = {\nnodes: [\n{id: 1, label: \"1\", x: 0, y: 100},\n{id: 2, label: \"2\", x: 200, y: 0},\n{id: 3, label: \"3\", x: 200, y: 200},\n{id: 4, label: \"4\", x: 400, y: 100},\n{id: 5, label: \"5\", x: 600, y: 100},\n{id: 6, label: \"6\", x: 800, y: 200},\n{id: 7, label: \"7\", x: 800, y: 0},\n{id: 8, label: \"8\", x: 1000, y: 100}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 4},\n{from: 3, to: 4},\n{from: 4, to: 5, color: \"red\", width: 5},\n{from: 5, to: 6},\n{from: 5, to: 7},\n{from: 6, to: 8},\n{from: 7, to: 8},\n{from: 2, to: 5, color: \"red\", width: 5},\n]\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#1", "metadata": {"Header 1": "Diffusion", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#1", "page_content": "In networks, we can model the spread of information, disease, or other phenomena as a diffusion process. The diffusion\nprocess usually starts with an initial node or a set of initial nodes. The goal is then to model how the information\nspreads through the network. You can imagine why this would be important for modeling the spread of a disease or an\nadvertising campaign on social media where the goal is to reach as many people as possible."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#2", "metadata": {"Header 1": "Diffusion", "Header 2": "Innovation Diffusion", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#2", "page_content": "Already in 1962, Everett Rogers published a book called \"Diffusion of Innovations\" where he describes the spread of a\nnew idea or technology through a population. He split the adoption of a new idea into five stages: \n- **Knowledge/Awareness**: The individual is exposed to the innovation and gains knowledge of the innovation.\n- **Persuasion**: The individual is interested in the innovation and actively seeks information about the\ninnovation.\n- **Decision**: The individual makes a decision to adopt or reject the innovation.\n- **Implementation**: The individual implements the innovation and uses it as a trial.\n- **Confirmation**: The individual finalizes his/her decision to continue using the innovation. \nWhen analyzing the spread of a new innovation, Rogers found that the adoption of a new innovation follows a normal\ndistribution. \n- **Innovators 2.5%**: Innovators are the first individuals to adopt an innovation. Innovators are most often young\nand willing to take risks and have a high social status.\n- **Early Adopters 13.5%**: This is the second-fastest category of individuals who adopt an innovation. These individuals\nhave the highest degree of opinion leadership among the other adopter categories. Early adopters take more time to\nadopt an innovation than innovators due to more careful deliberation.\n- **Early Majority 34%**: Individuals in this category adopt an innovation after a varying degree of time. 
Most often,\nthe early majority waits to adopt an innovation until they see that the innovation has proven useful for others and are\nin contact with the early adopters.\n- **Late Majority 34%**: Individuals in this category will adopt an innovation after the average member of the society.\nThese individuals approach an innovation with a high degree of skepticism.\n- **Laggards 16%**: Individuals in this category are the last to adopt an innovation. Most often bound by traditions. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#3", "metadata": {"Header 1": "Diffusion", "Header 2": "Innovation Diffusion", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#3", "page_content": "- **Laggards 16%**: Individuals in this category are the last to adopt an innovation. Most often bound by traditions. \n \n\nWe can easily give some examples for the above distribution for when the first iPhone was released: \n- **Innovators (2.5%)**: These were the tech enthusiasts who camped outside Apple stores. They were excited and were\nwilling to embrace the new technology despite its high price and limited features compared to today's standards.\n- **Early Adopters (13.5%)**: The early adopters included individuals who closely followed tech trends and were\nquick to purchase the iPhone once they saw the positive reviews and early adopter experiences. They recognized the\niPhone's potential to change the way people communicate and access information.\n- **Early Majority (34%)**: As the iPhone gained popularity and started to prove its utility, the early majority\njoined in. These individuals might have been initially hesitant but were swayed by the success stories of the early\nadopters.\n- **Late Majority (34%)**: The late majority were more cautious and waited until the iPhone became a mainstream\nproduct. They wanted to ensure that any initial bugs or issues were resolved and that the price had become more\naffordable. Their decision to adopt the iPhone was influenced by its widespread acceptance and integration into daily\nlife.\n- **Laggards (16%)**: Laggards were the last to adopt the iPhone, often sticking with their traditional cell phones\nor resisting smartphones altogether. They were skeptical of the technology's benefits and preferred to maintain\ntheir existing routines and devices.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#4", "metadata": {"Header 1": "Diffusion", "Header 2": "ICM - Independent Cascade Model", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#4", "page_content": "The Independent Cascade Model (ICM) is a probabilistic diffusion model that is based on the idea that the spread of information\ntravels through neighbors in a network and therefore has a cascading effect. The model is based on the following assumptions: \n- A node can only affect its neighbors.\n- A node can only be in one of two states: active or inactive. For example, a node can be infected or not infected.\n- A node only has one chance to activate its neighbors.\n- A node can only go from inactive to active. \nThe initial setup of the model is as follows: \n- Each edge has an attribute $p \in [0,1]$, which is the probability that the node will take over the state of its neighbor.\nHow this probability is calculated depends on the application. For example, in the case of a disease, the probability\ncould be based on a person's age and immune system. In the case of an advertising campaign, the probability could be\nbased on the number of friends that have already seen the ad. You could also just use random probabilities.\n- A set of nodes $S$ is selected as the initial set of active nodes. All other nodes are inactive. \nThe model then proceeds in discrete time steps. In each time step, the following happens: \n1. For each node $v \in S$, the node tries to activate each of its neighbors $u$. The activation is successful with\nprobability $p_{vu}$, i.e. if a generated random value $r \in [0,1]$ is smaller than or equal to $p_{vu}$. If the activation\nis successful, $u$ is added to the set $S_{new}$.\n2. If $S_{new}$ is empty then the process terminates. Otherwise, $S$ is updated to $S_{new}$ and the process repeats\nfrom step 1. \n"}}
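The time-step loop described above can be sketched in Python. This is a minimal illustration, not code from the original page: the graph representation (a dict mapping each node to a list of neighbors) and the `probs` dict mapping each directed edge $(v, u)$ to its probability $p_{vu}$ are assumptions of this sketch.

```python
import random

def independent_cascade(graph, probs, seeds, rng=None):
    """Run the ICM until no new activations occur.

    graph: dict mapping node -> list of neighbours
    probs: dict mapping directed edge (v, u) -> activation probability p_vu
    seeds: initial set S of active nodes
    Returns the set of all nodes that are active when the process terminates.
    """
    rng = rng or random.Random()
    active = set(seeds)    # nodes that have ever been activated
    frontier = set(seeds)  # newly activated nodes: each one gets exactly one
                           # chance to activate its inactive neighbours
    while frontier:
        s_new = set()
        for v in frontier:
            for u in graph[v]:
                # only inactive nodes can be activated, with probability p_vu
                if u not in active and rng.random() <= probs[(v, u)]:
                    s_new.add(u)
        active |= s_new
        frontier = s_new   # an empty s_new terminates the process
    return active
```

With all edge probabilities set to 1 the cascade reaches every node reachable from the seeds; with all probabilities 0 it never leaves the seed set.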
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#5", "metadata": {"Header 1": "Diffusion", "Header 2": "ICM - Independent Cascade Model", "Header 3": "Spread Maximization", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#5", "page_content": "When working with the ICM model, we are often interested in finding the set of nodes $S$ that maximizes the spread, for\nexample in an advertising campaign. This is an [NP-Hard](../np) problem to solve, but we can use a greedy algorithm to find a good\nbut not necessarily optimal solution. (How is this an NP-Hard problem?) \nWe can denote the spread after the ICM model as $f(S)$ where $S$ is the set of initial nodes. The output of the function\nis the number of nodes that are active after the ICM model has finished. Using this we can then implement a greedy\nalgorithm that maximizes the spread, i.e. finds the set of nodes $S$ that maximizes $f(S)$. \nHowever, we first need to change a few things about the ICM model to make it easier to work with because the model is\nnon-deterministic. Instead of using a random probability $p$ for each edge and a random number generator to determine\nif the edge is activated, we can use a fixed $p$ and a fixed $r$ for each edge. Another possible approach\nis to define an \"activation function\" that takes the two nodes as input and decides if the edge is activated or not. \nFor example, we could define the activation function as follows: \n$$\na(u,v) = |u - v| \leq 2\n$$ \nMost often, when wanting to maximize the spread, for example of an advertising campaign, we are also on a budget. This\nmeans that we can only select a limited number of nodes $k$ as the initial set of active nodes, i.e. $|S| \leq k$. \nThe greedy algorithm then works as follows: \n1. Initialize $S = \emptyset$.\n2. For each vertex $v \in V \land v \notin S$ compute $f(S \cup \{v\})$.\n3. Select the vertex $v$ where $f(S \cup \{v\})$ is the highest and add it to $S$. If there are multiple vertices\nwith the same value, select one of them randomly.\n4. If $|S| = k$ then terminate, otherwise repeat from step 2. \n"}}
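The deterministic spread function and the greedy loop can be sketched as follows. This is an illustration, not the page's code: it assumes integer node labels so that the example activation function $a(u,v) = |u - v| \leq 2$ applies, and it breaks ties by iteration order rather than randomly.

```python
def spread(graph, seeds):
    """f(S): final cascade size when an edge (u, v) activates iff |u - v| <= 2,
    the example activation function, which makes the process deterministic."""
    active, frontier = set(seeds), set(seeds)
    while frontier:
        frontier = {u for v in frontier for u in graph[v]
                    if u not in active and abs(u - v) <= 2}
        active |= frontier
    return len(active)

def greedy_max_spread(graph, k):
    """Greedily build a seed set S with |S| <= k that maximizes f(S)."""
    S = set()
    while len(S) < k:
        # evaluate f(S ∪ {v}) for every vertex not yet in S and keep the best
        # one (ties broken by iteration order here, not randomly as in the text)
        best = max((v for v in graph if v not in S),
                   key=lambda v: spread(graph, S | {v}))
        S.add(best)
    return S
```

Note that the greedy choice is only a heuristic: it maximizes the marginal gain per step, not the global optimum.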
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#6", "metadata": {"Header 1": "Diffusion", "Header 2": "Linear Threshold Model", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#6", "page_content": "The linear threshold model is a diffusion model that is based on the idea that a node can only be activated if a certain\nproportion of its neighbors are already activated. The model is based on the same assumptions as the ICM model: \n- A node can only affect its neighbors.\n- A node can only be in one of two states: active or inactive. For example, a node can be infected or not infected.\n- A node only has one chance to activate its neighbors.\n- A node can only go from inactive to active. \nIn the model we define a threshold $t_v$ for each node $v$. The threshold is a value between $0$ and $1$ and defines\nthe proportion of neighbors that need to be active for the node to be activated. For example, if $t_v = 0.5$ then at\nleast half of the neighbors of $v$ need to be active for $v$ to be activated. \nFor the algorithm we then define an initial set of active nodes $S$ and then in each time step we do the following: \n1. For each node $v \in V \land v \notin S$ we compute the proportion of active neighbors $p_v$.\n2. If $p_v \geq t_v$ then we add $v$ to the set $S_{new}$.\n3. If $S_{new}$ is empty then the process terminates. Otherwise, $S$ is merged with $S_{new}$, i.e. $S = S \cup S_{new}$,\nand the process repeats from step 1. \n"}}
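The threshold update can be sketched in a few lines. This is an illustrative sketch, not code from the page; it assumes a dict-of-neighbour-lists graph in which every node has at least one neighbour (otherwise the proportion $p_v$ is undefined).

```python
def linear_threshold(graph, thresholds, seeds):
    """graph: node -> list of neighbours, thresholds: node -> t_v in [0, 1].
    Returns the final set of active nodes."""
    S = set(seeds)
    while True:
        s_new = set()
        for v in graph:
            if v in S:
                continue  # nodes can only go from inactive to active
            # proportion of v's neighbours that are currently active
            p_v = sum(u in S for u in graph[v]) / len(graph[v])
            if p_v >= thresholds[v]:
                s_new.add(v)
        if not s_new:
            return S  # no new activations: the process terminates
        S |= s_new
```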
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#7", "metadata": {"Header 1": "Diffusion", "Header 2": "Voter Model", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/diffusion.mdx#7", "page_content": "The voter model is a simple probabilistic diffusion model. To start the model, each node is assigned a random state\nwhich is either $0$ or $1$. In each time step, a node is selected at random and then one of its neighbors is also\nselected at random. The node then adopts the state of the selected neighbor. The process repeats until all nodes have\nthe same state."}}
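The voter model is short enough to sketch directly. The `max_steps` guard is an addition of this sketch (the description above simply runs until consensus), as is the dict-based graph representation.

```python
import random

def voter_model(graph, states, rng=None, max_steps=100_000):
    """graph: node -> list of neighbours, states: node -> 0 or 1.
    Runs until all nodes share one state; max_steps is just a safety guard."""
    rng = rng or random.Random(0)
    nodes = list(graph)
    states = dict(states)  # don't mutate the caller's dict
    for _ in range(max_steps):
        if len(set(states.values())) == 1:
            break  # consensus reached
        v = rng.choice(nodes)      # pick a random node ...
        u = rng.choice(graph[v])   # ... and one of its neighbours at random
        states[v] = states[u]      # v adopts u's state
    return states
```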
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/eulerianPath.mdx#1", "metadata": {"Header 1": "Eulerian Path", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/eulerianPath.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/eulerianPath.mdx#1", "page_content": "\nSeven Bridges of Königsberg and the Eulerian Path\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#1", "metadata": {"Header 1": "General Definition", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#1", "page_content": "A Graph is one of the most fundamental but also diverse data structures in computer science. A Graph consists of a set\nof vertices $V$ and a set of edges $E$ where each edge is an unordered pair. Hence, $G=(V,E)$. They are used to\nrepresent relationships between various entities or elements (the vertices) by connecting them with edges. \nFor example, a graph can be used to represent a social network where the vertices are people and the edges represent\nwhether they are friends with each other or not, no edge signifying that they are not friends. In the below graph\n$G=(V,E)$ where: \n- $V=\\{\\text{Bob, Alice, Michael, Urs, Karen}\\}$ and\n- $E=\\{(1,2),(1,3),(2,4),(2,5)\\}$ \nexport const friendsGraph = {\nnodes: [\n{id: 1, label: \"Bob\"},\n{id: 2, label: \"Alice\"},\n{id: 3, label: \"Michael\"},\n{id: 4, label: \"Urs\"},\n{id: 5, label: \"Karen\"}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 4},\n{from: 2, to: 5}\n]\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#2", "metadata": {"Header 1": "General Definition", "Header 2": "Metrics", "Header 3": "Degrees", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#2", "page_content": "If we do some quick analysis of this graph using the degree function which returns the number of edges connected to a\nvertex, we can see that $\\text{deg(Alice)}=3$ and therefore Alice has the most friends in this social network."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#3", "metadata": {"Header 1": "General Definition", "Header 2": "Metrics", "Header 3": "Order", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#3", "page_content": "The order of a graph is the number of vertices in the graph. In the above example, the order of the graph is 5, so it\ncould also be called an order-5 graph."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#4", "metadata": {"Header 1": "General Definition", "Header 2": "Metrics", "Header 3": "Diameter", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#4", "page_content": "The diameter of a graph is the longest shortest path between two vertices in the graph. So in the above example, the\ndiameter of the graph is 3 as the longest shortest path is between Michael and Karen."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#5", "metadata": {"Header 1": "General Definition", "Header 2": "Metrics", "Header 3": "Density", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#5", "page_content": "The density of a graph is the ratio of the number of edges to the number of possible edges. In other words, it measures how\ndensely connected the graph is. In a directed graph there are $|V|(|V|-1)$ possible edges, which means the density of\na directed graph is: \n$$\nD = \frac{|E|}{|V|(|V|-1)}\n$$ \nIn an undirected graph, there are $\frac{|V|(|V|-1)}{2}$ possible edges, which means the density of an undirected graph\nis: \n$$\nD = \frac{|E|}{\frac{|V|(|V|-1)}{2}} = \frac{2|E|}{|V|(|V|-1)}\n$$ \nSo in the above example, the density of the graph is $\frac{8}{20} = 0.4$."}}
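The metrics above can be computed directly for the friends graph from the introduction (node ids as in the export object). This is an illustrative sketch; the BFS shortest-path helper is an implementation detail, not something defined on this page.

```python
from collections import deque

# friends graph from the introduction: 1=Bob, 2=Alice, 3=Michael, 4=Urs, 5=Karen
edges = [(1, 2), (1, 3), (2, 4), (2, 5)]
adj = {v: set() for v in range(1, 6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)  # undirected: store both directions

def degree(v):
    return len(adj[v])

def order():
    return len(adj)

def distances(s):
    """Shortest-path lengths from s via breadth-first search."""
    dist, queue = {s: 0}, deque([s])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def diameter():
    """The longest shortest path over all pairs of vertices."""
    return max(max(distances(s).values()) for s in adj)

def density():
    n = len(adj)
    return 2 * len(edges) / (n * (n - 1))  # undirected formula from above
```

This reproduces the numbers from the sections above: Alice has degree 3, the order is 5, the diameter is 3 (Michael to Karen) and the density is $\frac{8}{20} = 0.4$.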
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#6", "metadata": {"Header 1": "General Definition", "Header 2": "Graphs of Functions", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#6", "page_content": "You might be more familiar with Graphs when talking about mathematical functions. In mathematics, a Graph of a Function\nis a visual representation of the relationship between the input values (domain) and their corresponding output values\n(range) under a specific function. \nFormally, a Graph of a Function can be defined as follows: \nLet $f$ be a function defined on a set of input values, called the domain $D$, and taking values in a set of output\nvalues, called the range $R$. The Graph of the Function $f$, denoted as $G(f)$, is a mathematical representation\nconsisting of a set of ordered pairs $(x, y)$, where $x \in D$ and $y = f(x)$. Each ordered pair represents a point on\nthe graph, with $x$ as the independent variable (input) and $y$ as the dependent variable (output). \nIn other words, the Graph of a Function is a visual representation of how the elements in the domain are mapped to\nthe corresponding elements in the range through the function $f$. \nFor example, consider the following function: \n$$\nf(x) = 2x + 1\n$$ \nIts domain could be the set of all real numbers $\Bbb{R}$, and its range could also be $\Bbb{R}$. To represent this\nfunction graphically, we plot points on the Cartesian plane where the $x$-coordinate corresponds to the input value,\nand the $y$-coordinate is the output value obtained by evaluating $f(x)$. \nGraphs of Functions play a crucial role in analyzing and understanding the behavior of functions and studying their\noverall patterns and trends."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#7", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Complete Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#7", "page_content": "A complete graph is a graph where each vertex is connected to every other vertex, very simple. In other words, a\ncomplete graph contains all possible edges. A complete graph with $n$ vertices is denoted as $K_n$. \nFor example, the below graph is a complete graph with 5 vertices, $K_5$. \nexport const completeGraph = {\nnodes: [\n{id: 1, label: \"1\"},\n{id: 2, label: \"2\"},\n{id: 3, label: \"3\"},\n{id: 4, label: \"4\"},\n{id: 5, label: \"5\"}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 2, to: 3},\n{from: 2, to: 4},\n{from: 2, to: 5},\n{from: 3, to: 4},\n{from: 3, to: 5},\n{from: 4, to: 5}\n]\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#8", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Directed Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#8", "page_content": "A Directed Graph is a graph where each edge is directed from one vertex to another. In other words, the edges have a\ndirection. The previous graph was an example of an undirected graph as we would hope that if person A thinks of person B\nas a friend then person B feels the same way. However, in a directed graph, this is not necessarily the case. \nSo we can define a directed graph as $G=(V,A)$ where: \n- V is again the set of vertices\n- A is a set of ordered pairs of vertices, called arcs, arrows or directed edges (sometimes simply edges with the\ncorresponding set named E instead of A). Ordered pairs are used here because (unlike for undirected graphs) the\ndirection of an edge $(u,v)$ is important: $(u,v)$ is not the same edge as $(v,u)$. The first vertex is called the\ntail or initial vertex and the second vertex is called the head or terminal vertex. \nFor example, let us imagine a directed graph where the vertices are the same as in the previous example but the edges\nsignify whether a person has liked another person's post on social media. We then get the below graph $G=(V,A)$ where: \n- $V=\{\text{Bob, Alice, Michael, Urs, Karen}\}$ and\n- $A=\{(1,2),(1,3),(2,4),(2,5),(5,2)\}$ \nexport const postLikedGraph = {\nnodes: [\n{id: 1, label: \"Bob\"},\n{id: 2, label: \"Alice\"},\n{id: 3, label: \"Michael\"},\n{id: 4, label: \"Urs\"},\n{id: 5, label: \"Karen\"}\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n{from: 2, to: 4},\n{from: 2, to: 5},\n{from: 5, to: 2}\n]\n}; \n \nWhen talking about degrees in a directed graph, we more often distinguish between the in-degree and out-degree of a vertex.\nThe in-degree of a vertex is the number of edges that are pointing to that vertex and the out-degree is the number of\nedges that are pointing away from that vertex. So in the above example, the in-degree of Alice is 2 and the out-degree\nis also 2."}}
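In-degree and out-degree fall out directly from the ordered pairs. A small sketch (the `(tail, head)` list representation is an assumption of this example) using the "liked a post" arcs from above:

```python
# arcs of the "liked a post" graph above, as ordered (tail, head) pairs
arcs = [(1, 2), (1, 3), (2, 4), (2, 5), (5, 2)]

def in_degree(v):
    """Number of arcs pointing to v (v appears as the head)."""
    return sum(1 for _, head in arcs if head == v)

def out_degree(v):
    """Number of arcs pointing away from v (v appears as the tail)."""
    return sum(1 for tail, _ in arcs if tail == v)
```

For Alice (node 2) both values are 2, matching the text above.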
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#9", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Weighted Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#9", "page_content": "A weighted graph is a graph where each edge has a weight (a number associated with it). It can be defined as a triple\n$G = (V, E, w)$ where $w$ is a function that maps edges or directed edges to their weights. So, $w: E \rightarrow \Bbb{R}$\ncould be a function for a graph with real numbers as weights. \nFor example, in a graph where each city is a vertex and each edge is a road between two cities, the weight could\nbe the distance in km between them. \nexport const cityGraph = {\nnodes: [\n{id: 1, label: \"London\"},\n{id: 2, label: \"Paris\"},\n{id: 3, label: \"Berlin\"}\n],\nedges: [\n{from: 1, to: 2, value: 343, label: \"343\", length: 200},\n{from: 1, to: 3, value: 933, label: \"933\", length: 400},\n{from: 2, to: 3, value: 878, label: \"878\", length: 300},\n]\n} \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#10", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Networks vs Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#10", "page_content": "The only real difference between a network and a graph is the terminology. A network is a graph with a real-world context.\nFor example, a social network is a graph with a real-world context. A graph is a mathematical structure that represents\nrelationships between objects. When talking about a network we also tend to talk about nodes and links instead of\nvertices and edges. \n| Graphs | Networks |\n|--------|----------|\n| Vertices | Nodes |\n| Edges | Links | \n[Resource](https://bence.ferdinandy.com/2018/05/27/whats-the-difference-between-a-graph-and-a-network/)"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#11", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Trees", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#11", "page_content": "A tree is a special type of graph where there is only one path between any two vertices. This means that there are no\ncycles in a tree. Trees are used in many different algorithms and data structures such as binary search trees. You can\nread more about trees in the [Trees](../trees/generalDefinition) section."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#12", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Cycle/Circular Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#12", "page_content": "A cycle or circular graph is a graph with exactly one cycle, where a cycle is a non-empty path in which only the first and\nlast vertices are equal, i.e. a path that starts and ends at the same vertex. You can think of it as a closed chain. \nA path/trail/walk in a graph is defined as a sequence of vertices where consecutive vertices in the sequence are\nconnected by an edge. \nCommonly a cycle with length $n$ is called an $n$-cycle and is denoted as $C_n$. \nFor example, the below graph is a circular graph as it has only one cycle with the following path: \n$$\nP = (a,b,c,a)\n$$ \nexport const cycleGraph = {\nnodes: [\n{id: 1, label: \"a\"},\n{id: 2, label: \"b\"},\n{id: 3, label: \"c\"}\n],\nedges: [\n{from: 1, to: 2},\n{from: 2, to: 3},\n{from: 3, to: 1},\n]\n} \n \nTo read more about cycle graphs, check out [this article](https://mathworld.wolfram.com/CycleGraph.html)"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#13", "metadata": {"Header 1": "General Definition", "Header 2": "Types of Graphs", "Header 3": "Acyclic Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/generalDefinition.mdx#13", "page_content": "An acyclic graph is a graph that is almost the opposite of a cycle graph. It is a graph that has no cycles. This means\nthat there is no path that starts and ends at the same vertex. A popular example of an acyclic graph is a tree. \nexport const acyclicGraph = {\nnodes: [\n{id: 5, label: \"5\", level: 0},\n{id: 2, label: \"2\", level: 1},\n{id: 6, label: \"6\", level: 1},\n{id: 1, label: \"1\", level: 2},\n{id: 4, label: \"4\", level: 2},\n{id: 8, label: \"8\", level: 2},\n{id: 3, label: \"3\", level: 3},\n{id: 7, label: \"7\", level: 3},\n{id: 9, label: \"9\", level: 3},\n],\nedges: [\n{from: 5, to: 2},\n{from: 5, to: 6},\n{from: 2, to: 1},\n{from: 2, to: 4},\n{from: 6, to: 8},\n{from: 4, to: 3},\n{from: 8, to: 7},\n{from: 8, to: 9},\n]\n} \nexport const acyclicOptions = {\nlayout: {\nhierarchical: {\nenabled: true,\ndirection: \"UD\"\n},\n},\n} \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx#1", "metadata": {"Header 1": "Graph Traversal", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx#1", "page_content": "The goal of graph traversal is to visit each vertex in a graph. This can be done in a multitude of ways. \nThe general algorithm is as follows. We have a root vertex $s$, a set of all visited vertices $B$, a subset $R$ of $B$ whose vertices still have unvisited outgoing edges, and $O$ which holds the order in which the vertices were visited. \n```c\nadd s to R and set s.visited = true\n\nwhile R is not empty\n    take any vertex v in R\n    if v has no unvisited outgoing edges\n        remove v from R\n    else\n        follow an unvisited edge from v to w\n        if !w.visited add w to R and set w.visited = true\n``` \nWith an adjacency list this takes $O(n+m)$ because each edge is followed once ($O(m)$) and each vertex is added to and removed from $R$ once ($O(n)$). \nWith an adjacency matrix this takes $O(n^2)$ because the entire matrix has to be checked for edges."}}
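The general algorithm can be made concrete in Python. This is a sketch, not the page's own code: the graph is assumed to be a dict of adjacency lists, and "take any vertex in R" is resolved by always taking the most recently added one, which is one valid choice (and makes this particular sketch behave like a depth-first search).

```python
def traverse(adj, s):
    """Visit every vertex reachable from s, returning the visit order O.

    adj: dict mapping vertex -> list of adjacent vertices.
    """
    visited = {s}
    O = [s]           # order in which the vertices were visited
    R = [s]           # visited vertices that may still have unvisited edges
    while R:
        v = R[-1]     # take any vertex in R (here: the last one added)
        unvisited = [w for w in adj[v] if w not in visited]
        if not unvisited:
            R.pop()   # v has no unvisited outgoing edges: remove it from R
        else:
            w = unvisited[0]   # follow an unvisited edge from v to w
            visited.add(w)
            O.append(w)
            R.append(w)
    return O
```

Taking the oldest vertex in `R` instead would give a breadth-first order, matching the two specializations below.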
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx#2", "metadata": {"Header 1": "Graph Traversal", "Header 2": "DFS - Depth First Search", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx#2", "page_content": "A depth first search (DFS) visits the child vertices of a chosen vertex before visiting the sibling vertices. In other words, it traverses the depth of any particular path before exploring its breadth. A stack (often the program's call stack via recursion) is generally used when implementing this algorithm. \nThe algorithm begins with a root vertex. It then transitions to an adjacent, unvisited vertex until it can no longer find an unexplored vertex to transition to from its current location. The algorithm then backtracks until it finds a vertex connected to yet more unexplored vertices. This process carries on until the algorithm has backtracked past the original \"root\" vertex from the very first step. \nSo if in the general algorithm we replace \"add w to R\" with a recursive call \"dfs(w)\", we get something like this: \n```java\nvoid dfs(Vertex v) {\n    print(v); v.visited = true;\n    for (Vertex w : v.adjList) {\n        if (!w.visited) {\n            dfs(w);\n        }\n    }\n}\n\nvoid dfs_variante(Vertex v) {\n    if (!v.visited) {\n        print(v); v.visited = true;\n        for (Vertex w : v.adjList) {\n            dfs_variante(w);\n        }\n    }\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx#3", "metadata": {"Header 1": "Graph Traversal", "Header 2": "BFS - Breadth First Search", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/graphTraversal.mdx#3", "page_content": "Instead of searching down a single path until we can go no further, we search all paths at a uniform depth, one unit at a time, from the source before moving onto deeper paths. We add vertices to the back of a queue to be searched from in the future. Thus, we start with our source vertex in the queue and whenever we dequeue an item, we enqueue all of its \"new\" neighbours who are one unit further away, so the queue stores all vertices of distance 1 from the source before all vertices of distance 2 from the source, and so forth. \n```java\nvoid BFS(Vertex s) {\n    Queue<Vertex> R = new LinkedList<>();\n    print(s); s.visited = true;\n    R.add(s);\n    while (!R.isEmpty()) {\n        Vertex v = R.remove();\n        for (Vertex w : v.adjList) {\n            if (!w.visited) {\n                print(w); w.visited = true;\n                R.add(w);\n            }\n        }\n    }\n}\n``` \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#1", "metadata": {"Header 1": "Link Prediction", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#1", "page_content": "The idea of link prediction is, as the name suggests, to predict which link, i.e. which edge, will most likely be formed in the\nfuture for a given graph $G$. This is a very important problem in social network analysis and has many applications in\nthe real world such as in recommender systems to recommend friends, products, etc. to users. \n\nAdd visual examples\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#2", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#2", "page_content": "There are many methods to predict links in a graph. One of the most basic methods is to use the neighbourhood of a node\nin some way to predict the links."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#3", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "Header 3": "Common Neighbours", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#3", "page_content": "This method is based on the common neighbours metric of two nodes. The idea behind it is that if two nodes have many\ncommon neighbours, then they are more likely to be connected in the future. You can imagine this as a friend of a friend\nis more likely to be your friend than a complete stranger. \nThe algorithm is very simple. For each pair of nodes $(u, v)$, we calculate the number of common neighbours $c$ and\nour prediction is the link/edge that corresponds to the pair with the highest $c$. \nTo calculate the number of common neighbours, we can use the following formula: \n$$\n\text{commonNeighbours}(u, v) = |N[u] \cap N[v]|\n$$ \nwhere $N[u]$ is the set of neighbours of node $u$."}}
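With neighbourhoods stored as Python sets (an assumption of this sketch, used for all the metric sketches in this section), the formula is a one-liner:

```python
def common_neighbours(N, u, v):
    """N maps each node to its set of neighbours N[u]."""
    return len(N[u] & N[v])  # size of the intersection of the neighbourhoods
```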
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#4", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "Header 3": "Jaccard Index", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#4", "page_content": "The Jaccard index, also commonly known as the Jaccard similarity coefficient, is a measure of similarity between two\nsets and is used in many different applications. In computer vision for example, it is used to compare the similarity\nof two images but is there more commonly known as the intersection over union (IoU) metric, because that is exactly what it is. \n$$\n\text{jaccardIndex}(u, v) = \frac{|N[u] \cap N[v]|}{|N[u] \cup N[v]|}\n$$ \nWhen using the Jaccard index for link prediction the idea is the same as with common neighbours. However, it takes into\naccount the size of the neighbourhoods of the two nodes. For example, if two nodes have 100 neighbours of which 5 are\ncommon, then they are less likely to be connected than two nodes with 10 neighbours of which 5 are common. \n$$\n\frac{5}{100} < \frac{5}{10}\n$$"}}
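As a sketch over neighbour sets (again assuming the dict-of-sets representation):

```python
def jaccard_index(N, u, v):
    """Intersection over union of the two neighbourhoods."""
    return len(N[u] & N[v]) / len(N[u] | N[v])
```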
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#5", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "Header 3": "Soundarajan-Hopcroft", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#5", "page_content": "If the community structure of a graph is known, i.e. which nodes belong to which community, then we can use this to\nimprove the common neighbours method for our link prediction. We do this by just adding the number of common neighbours\nthat are also in the same community to the common neighbours metric. So if two nodes have many common neighbours that\nare also in the same community, and another pair of nodes have many common neighbours but with less in the same\ncommunity, then the first pair is more likely to be connected in the future. \n$$\n\text{soundarajanHopcroft}(u, v) = |N[u] \cap N[v]| + \sum_{w \in N[u] \cap N[v]} f(w)\n$$ \nwhere $f(w)$ is a function that returns 1 if $w$ is in the same community as $u$ and $v$, and 0 otherwise."}}
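A sketch with a `community` dict mapping each node to a community label; reading $f(w)$ as "$w$ lies in the same community as both $u$ and $v$" is an interpretation of this sketch:

```python
def soundarajan_hopcroft(N, community, u, v):
    """Common neighbours plus a bonus of 1 for every common neighbour that
    shares u's and v's community."""
    common = N[u] & N[v]
    bonus = sum(1 for w in common
                if community[w] == community[u] == community[v])
    return len(common) + bonus
```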
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#6", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "Header 3": "Resource Allocation Index", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#6", "page_content": "Imagine we have a resource for example a cake and there are three nodes, $x,y$ and $z$. $x$ and $y$ have the common\nneighbour $z$. If we want the cake to be shared between $x$ and $y$, then we can give it to $z$ and $z$ will then\nequally share it with its neighbours. So if $z$ has 10 neighbours, then $x$ and $y$ will each get $\\frac{1}{10}$ of\nthe cake. If $z$ has 100 neighbours, then $x$ and $y$ will each get $\\frac{1}{100}$ of the cake. \nThe resource allocation index is based on this idea and is defined as follows: \n$$\n\\text{resourceAllocationIndex}(u, v) = \\sum_{w \\in N[u] \\cap N[v]} \\frac{1}{deg(w)} = \\sum_{w \\in N[u] \\cap N[v]} \\frac{1}{|N[w]|}\n$$ \nThe resource allocation index can then be interpreted as a form of closeness between two nodes. We would then expect\nthat two nodes that are close to each other are more likely to be connected in the future."}}
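The cake-sharing idea translates directly (same dict-of-sets assumption as the sketches above):

```python
def resource_allocation(N, u, v):
    """Each common neighbour w passes on 1/deg(w) of the resource."""
    return sum(1 / len(N[w]) for w in N[u] & N[v])
```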
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#7", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "Header 3": "Adamic-Adar Index", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#7", "page_content": "The Adamic-Adar index is very similar to the resource allocation index. The only difference is that Adamic-Adar index\nweakens the denominator by taking the natural logarithm of the number of neighbours of a common neighbour. \n\nWhy is this done? I have no idea. I have not found any explanation for this. If you know why, please let me know.\nIt just seems to make the metric more complicated for no reason and the results larger.\n \n$$\n\\text{adamicAdarIndex}(u, v) = \\sum_{w \\in N[u] \\cap N[v]} \\frac{1}{\\ln(deg(w))} = \\sum_{w \\in N[u] \\cap N[v]} \\frac{1}{\\ln(|N[w]|)}\n$$"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#8", "metadata": {"Header 1": "Link Prediction", "Header 2": "Neighbourhood-based methods", "Header 3": "Preferential Attachment", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#8", "page_content": "The preferential attachment method is based on the idea that nodes with a high degree are more likely to be effected\nby the addition of a new link than nodes with a low degree. The preferential attachment is defined as follows: \n$$\n\\text{preferentialAttachment}(u, v) = deg(u) \\cdot deg(v) = |N[u]| \\cdot |N[v]|\n$$ \nSo if we have two nodes with a high degree, then the preferential attachment will be high and if we have two nodes with\na low degree, then the preferential attachment and therefore the likelihood of them being connected in the future will\nbe low."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#9", "metadata": {"Header 1": "Link Prediction", "Header 2": "The Link Prediction Problem", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/linkPrediction.mdx#9", "page_content": "In the research paper [The Link Prediction Problem for Social Networks](https://www.cs.cornell.edu/home/kleinber/link-pred.pdf)\nfrom 2004 an experiment was conducted to test the performance of the different link prediction methods. The experiment\nused a network containing publications and authors from different research fields. \nThey had training networks from 1994 - 1996 and test networks from 1997 - 1999. \nThe goal was to predict which authors would publish together in the future. For this they extract the core, which\ncontained the authors that published at least 3 papers in the timeframe of the training networks and 3 papers in the\ntimeframe of the test networks. \nThey then used the different link prediction methods to predict which authors would publish together in the future but\nonly kept the highest predictions that connected 2 authors within the core. They then compared the predictions to the\nactual publications in the test networks core where the common neighbours method was the baseline."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#1", "metadata": {"Header 1": "Multirelational Graph", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#1", "page_content": "A multirelational graph is a graph where there are multiple types of edges. For example, in a social network, there\ncould be edges for friendships, likes, follows, etc. This then also most often leads to a multigraph, which is a graph\nwhere there can be multiple edges between two vertices. \nAn example of a multigraph could be a road network where each edge represents a road and the vertices represent cities. \nexport const multiGraph = {\nnodes: [\n{id: 1, label: \"London\", x: 150, y: 0},\n{id: 2, label: \"Paris\", x: 0, y: 200},\n{id: 3, label: \"Berlin\", x: 300, y: 200}\n],\nedges: [\n{from: 1, to: 2, label: \"Road 1\", smooth: {type: \"curvedCCW\", roundness: 0.5}},\n{from: 1, to: 2, label: \"Road 2\", smooth: {type: \"curvedCW\", roundness: 0.5}},\n{from: 1, to: 2, label: \"Road 3\", smooth: {type: \"continuous\", roundness: 0.2}},\n{from: 1, to: 3, label: \"Road 4\", smooth: {type: \"curvedCCW\", roundness: 0.3}},\n{from: 1, to: 3, label: \"Road 5\", smooth: {type: \"curvedCW\", roundness: 0.3}},\n{from: 2, to: 3, label: \"Road 6\", smooth: {type: \"continuous\", roundness: 0.2}}\n]\n} \n \nAs an example of a multirelational graph, we can expand on the graph from above and add edges for other types of\ntransport like trains. 
\nexport const multirelationalGraph = {\nnodes: [\n{id: 1, label: \"London\", x: 150, y: 0},\n{id: 2, label: \"Paris\", x: 0, y: 200},\n{id: 3, label: \"Berlin\", x: 300, y: 200}\n],\nedges: [\n{from: 1, to: 2, label: \"Road 1\", smooth: {type: \"curvedCCW\", roundness: 0.5}},\n{from: 1, to: 2, label: \"Road 2\", smooth: {type: \"curvedCW\", roundness: 0.5}},\n{from: 1, to: 2, label: \"Plane 1\", smooth: {type: \"continuous\", roundness: 0.2}, color: \"pink\"},\n{from: 1, to: 3, label: \"Road 4\", smooth: {type: \"curvedCCW\", roundness: 0.3}},\n{from: 1, to: 3, label: \"Plane 2\", smooth: {type: \"curvedCW\", roundness: 0.3}, color: \"pink\"},"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#2", "metadata": {"Header 1": "Multirelational Graph", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#2", "page_content": "{from: 1, to: 3, label: \"Road 4\", smooth: {type: \"curvedCCW\", roundness: 0.3}},\n{from: 1, to: 3, label: \"Plane 2\", smooth: {type: \"curvedCW\", roundness: 0.3}, color: \"pink\"},\n{from: 2, to: 3, label: \"Road 6\", smooth: {type: \"continuous\", roundness: 0.2}}\n]\n} \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#3", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "Multiplex Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#3", "page_content": "A multireletional graph can also be split into layers of a multiplex graph. A multiplex graph is a graph where there\nare multiple layers of different edges but the same vertices. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#4", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "Signed Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#4", "page_content": "A signed graph is a graph in which each edge has a positive or negative sign. They can be used to represent a\nrelationship between two vertices. Because there are two possible edges between two vertices, a positive and a negative\none, it means that a signed graph is a [multirelational graph](#multirelational-graphs). \nFor example, a positive sign could mean that two people are allies and a negative\nsign could mean that they are enemies. \nexport const signedGraph = {\nnodes: [\n{id: 1, label: \"Bob\"},\n{id: 2, label: \"Alice\"},\n{id: 3, label: \"Michael\"},\n{id: 4, label: \"Urs\"},\n{id: 5, label: \"Karen\"}\n],\nedges: [\n{from: 1, to: 2, label: \"-\", color: \"red\"},\n{from: 1, to: 3, label: \"+\", color: \"green\"},\n{from: 2, to: 3, label: \"-\", color: \"red\"},\n{from: 2, to: 4, label: \"+\", color: \"green\"},\n{from: 2, to: 5, label: \"-\", color: \"red\"},\n{from: 3, to: 5, label: \"+\", color: \"green\"},\n]\n} \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#5", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "Signed Graphs", "Header 3": "Triads and Balance Theory", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#5", "page_content": "Triads are a set of three vertices in a signed graph where each pair of vertices is connected by an edge, i.e forming a\ntriangle. Triads are important in social network analysis as they can be used to determine the stability of a social\nnetwork. \nThe balance theory states that a social network is balanced if all triads within that network are balanced. For a triad\nto be balanced, the number of negative edges must be even (0 being even). This leads to the following four possible\ntriads: \n![balancedTriads](/compSci/balancedTriads.png)\n \nThe first and last one are the simplest as they are either all positive or all negative. The idea of the second one\nis that it is balanced because \"the enemy of my enemy is my friend\". The third one is a common scenario that leads to\nissues in social networks. For example, if Alice and Eve are allies and Bob and Eve are alies but Alice and\nBob are enemies, then this leads to issues in the social network if Eve wants to introduce Alice and Bob to each other."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#6", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "Signed Graphs", "Header 3": "Balanced Signed Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#6", "page_content": "From the above, we can see that a signed graph is balanced if all triads are balanced. Where a triade could also be\ndefined as a cycle of length 3, $C_3$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#7", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#7", "page_content": "An n-mode network is a graph where there are multiple types of vertices and the edges can only connect vertices of\ndifferent types. So in other words the graphs vertices can be split into $n$ disjoint sets and the edges can only\nconnect vertices from different sets. A normal graph is a 1-mode network as there is only one type of vertex.\nA 2-mode network is a graph where there are two types of vertices and the edges can only connect vertices of\ndifferent types. \nFor example, if we have a graph containing people and movies and the edges represent whether a person has watched a\nmovie or not. \nexport const movieReviewGraph = {\nnodes: [\n{id: 1, label: \"Bob\"},\n{id: 2, label: \"Alice\"},\n{id: 3, label: \"Michael\"},\n{id: 4, label: \"Urs\"},\n{id: 5, label: \"Karen\"},\n{id: 6, label: \"Inception\", color: \"pink\"},\n{id: 7, label: \"Titanic\", color: \"pink\"},\n{id: 8, label: \"The Godfather\", color: \"pink\"},\n{id: 9, label: \"Pulp Fiction\", color: \"pink\"},\n{id: 10, label: \"The Dark Knight\", color: \"pink\"},\n],\nedges: [\n{from: 1, to: 6, label: \"4\"},\n{from: 1, to: 7, label: \"3\"},\n{from: 2, to: 8, label: \"5\"},\n{from: 2, to: 9, label: \"4\"},\n{from: 5, to: 8, label: \"2\"},\n{from: 3, to: 6, label: \"3\"},\n{from: 3, to: 7, label: \"5\"},\n{from: 4, to: 9, label: \"3\"},\n{from: 5, to: 10, label: \"4\"},\n{from: 1, to: 10, label: \"5\"}\n]\n} \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#8", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "Header 3": "Bipartite Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#8", "page_content": "A graph $G$ is bipartite if its vertices can be split into two disjoint sets $V_1$ and $V_2$ such that every edge in\n$G$ connects a vertex in $V_1$ to a vertex in $V_2$. So there are no edges between vertices in the same set! \nThis makes a 2-mode network a bipartite graph. \n \n#### Bipartite Network Projections \nThe idea of a bipartite network projection is to project a bipartite graph to a \"normal\" graph i.e. 1-mode network. This\ncan be done in a few ways. \n##### Simple Projection \nIn the simple bipartite network projection, we project the bipartite graph to a 1-mode network by connecting two vertices\nif they have a common neighbor of the type to be removed in the bipartite graph. \nFor example, if we have a bipartite graph with people and events and the edges represent whether a person has attended\nan event or not. We can then project this bipartite graph to a 1-mode network where the vertices are people, and they\nare connected if they have attended the same event. \nBy doing this we can quickly find people that have been to the same events and might have similar interests. 
So if we\nhave the below graph: \nexport const eventGraph = {\nnodes: [\n{id: 1, label: \"Concert\", color: \"pink\"},\n{id: 2, label: \"University Open Day\", color: \"pink\"},\n{id: 3, label: \"Birthday Party\", color: \"pink\"},\n{id: 4, label: \"Bob\"},\n{id: 5, label: \"Alice\"},\n{id: 6, label: \"Michael\"},\n{id: 7, label: \"Urs\"},\n{id: 8, label: \"Karen\"},\n{id: 9, label: \"John\"},\n{id: 10, label: \"Emma\"},\n{id: 11, label: \"David\"},\n{id: 12, label: \"Sophia\"},\n],\nedges: [\n{from: 4, to: 1},\n{from: 6, to: 1},\n{from: 7, to: 1},\n{from: 8, to: 1},\n{from: 10, to: 1},\n{from: 11, to: 1},\n{from: 12, to: 1},\n{from: 6, to: 2},\n{from: 7, to: 2},\n{from: 8, to: 2},\n{from: 9, to: 2},\n{from: 4, to: 3},\n{from: 5, to: 3},\n]\n} \n \nand perform a simple projection we get the following graph: \nexport const eventSimpleProjGraph = {\nnodes: ["}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#9", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "Header 3": "Bipartite Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#9", "page_content": "{from: 6, to: 2},\n{from: 7, to: 2},\n{from: 8, to: 2},\n{from: 9, to: 2},\n{from: 4, to: 3},\n{from: 5, to: 3},\n]\n} \n \nand perform a simple projection we get the following graph: \nexport const eventSimpleProjGraph = {\nnodes: [\n{id: 4, label: \"Bob\"},\n{id: 5, label: \"Alice\"},\n{id: 6, label: \"Michael\"},\n{id: 7, label: \"Urs\"},\n{id: 8, label: \"Karen\"},\n{id: 9, label: \"John\"},\n{id: 10, label: \"Emma\"},\n{id: 11, label: \"David\"},\n{id: 12, label: \"Sophia\"},\n],\nedges: [\n// Bob knows everyone except John\n{from: 4, to: 5},\n{from: 4, to: 6},\n{from: 4, to: 7},\n{from: 4, to: 8},\n{from: 4, to: 10},\n{from: 4, to: 11},\n{from: 4, to: 12},\n// Alice only knows Bob, covered above\n// John only knows Karen, Michael and Urs\n{from: 9, to: 6},\n{from: 9, to: 7},\n{from: 9, to: 8},\n// the rest all know each other\n{from: 6, to: 7},\n{from: 6, to: 8},\n{from: 6, to: 10},\n{from: 6, to: 11},\n{from: 6, to: 12},\n{from: 7, to: 8},\n{from: 7, to: 10},\n{from: 7, to: 11},\n{from: 7, to: 12},\n{from: 8, to: 10},\n{from: 8, to: 11},\n{from: 8, to: 12},\n{from: 10, to: 11},\n{from: 10, to: 12},\n{from: 11, to: 12},\n]\n} \n \n##### Weighted Projection \nThe problem with the simple projection is that it does not take into account how many edges two vertices have in common.\nFor example, if two people have been to the same event 10 times, they will be connected the same way as two people that\nhave only been to the same event once. To solve this, we can use a weighted projection. 
\nIn a weighted projection, we connect two vertices if they have a common neighbor of the second type (the one to be removed),\njust like in the simple projection, but the weight of the edge is the number of common neighbors. More formally: \n$$\nw_{ab} = \\sum_{k \\in V_2} d_k^a d_k^b\n$$ \n- Where $V_2$ is the set that contains the vertices of the second type, i.e. the one that will be projected away (in the\nexample above, the events)."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#10", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "Header 3": "Bipartite Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#10", "page_content": "$$\nw_{ab} = \\sum_{k \\in V_2} d_k^a d_k^b\n$$ \n- Where $V_2$ is the set that contains the vertices of the second type, i.e. the one that will be projected away (in the\nexample above, the events).\n- And where $d_k^a$ is 1 if $a$ and $k$ are connected and 0 otherwise and $d_k^b$ is 1 if $b$ and $k$ are connected. \nThe weighted projection could then also be normalized by dividing each weight by the maximum weight of the graph. This\nwould then give us a value between 0 and 1. \nexport const eventWeightedProjGraph = {\nnodes: [\n{id: 4, label: \"Bob\"},\n{id: 5, label: \"Alice\"},\n{id: 6, label: \"Michael\"},\n{id: 7, label: \"Urs\"},\n{id: 8, label: \"Karen\"},\n{id: 9, label: \"John\"},\n{id: 10, label: \"Emma\"},\n{id: 11, label: \"David\"},\n{id: 12, label: \"Sophia\"},\n],\nedges: [\n// Bob knows everyone except John\n{from: 4, to: 5, label: \"1\", length: 200},\n{from: 4, to: 6, label: \"1\", length: 200},\n{from: 4, to: 7, label: \"1\", length: 200},\n{from: 4, to: 8, label: \"1\", length: 200},\n{from: 4, to: 10, label: \"1\", length: 200},\n{from: 4, to: 11, label: \"1\", length: 200},\n{from: 4, to: 12, label: \"1\", length: 200},\n// Alice only knows Bob, covered above\n// John only knows Karen, Michael and Urs\n{from: 9, to: 6, label: \"1\", length: 200},\n{from: 9, to: 7, label: \"1\", length: 200},\n{from: 9, to: 8, label: \"1\", length: 200},\n// the rest all know each other\n{from: 6, to: 7, label: \"2\", length: 200, color: \"red\"},\n{from: 6, to: 8, label: \"2\", length: 200, color: \"red\"},\n{from: 6, to: 10, label: \"1\", 
length: 200},\n{from: 6, to: 11, label: \"1\", length: 200},\n{from: 6, to: 12, label: \"1\", length: 200},\n{from: 7, to: 8, label: \"2\", length: 200, color: \"red\"},\n{from: 7, to: 10, label: \"1\", length: 200},\n{from: 7, to: 11, label: \"1\", length: 200},\n{from: 7, to: 12, label: \"1\", length: 200},\n{from: 8, to: 10, label: \"1\", length: 200},\n{from: 8, to: 11, label: \"1\", length: 200},\n{from: 8, to: 12, label: \"1\", length: 200},\n{from: 10, to: 11, label: \"1\", length: 200},\n{from: 10, to: 12, label: \"1\", length: 200},\n{from: 11, to: 12, label: \"1\", length: 200},"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#11", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "Header 3": "Bipartite Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#11", "page_content": "{from: 8, to: 11, label: \"1\", length: 200},\n{from: 8, to: 12, label: \"1\", length: 200},\n{from: 10, to: 11, label: \"1\", length: 200},\n{from: 10, to: 12, label: \"1\", length: 200},\n{from: 11, to: 12, label: \"1\", length: 200},\n]\n} \n \nWe can clearly see that only Michael, Urs and Karen are the only ones that have been to the same event more than once\n(highlighted in red). \n##### Newmann-weighted Projection \nThis projection is also sometimes called \"collaboration weighted projection\" (no idea why). \nThe idea of this projection is to further build up on the weighted projection by also taking into account the degree of\nthe common neighbor, i.e. the number of edges connected to the common neighbor. To take this into account we can use the\nfollowing formula to calculate the weight of the edge between two vertices $a$ and $b$: \n$$\nw_{ab} = \\sum_{k \\in V_2} \\frac{d_k^a d_k^b}{\\text{deg}(k) - 1}\n$$ \n- Where $V_2$ is the set that contains the vertices of the second type, i.e. the one that will be projected away (in the\nexample above, the events).\n- And where $d_k^a$ is 1 if $a$ and $k$ are connected and 0 otherwise and $d_k^b$ is 1 if $b$ and $k$ are connected. \nThis projection can be valuable if we imagine the following scenario: \nWe have a graph of people and events. We have an\nevent like a concert where 5000 people attended, a birthday party where 15 people attended and an\nOpen day at a university where 100 people attended. 
We can assume that if two people the party and the open day,\nthey are more likely to have similar interests than if they both attended the concert, or we could simply state that it\nis more likely that they came in contact with each other at the party or open day than at the concert. \nSo if we project the same graph as above, we get the following graph: \nexport const eventNewmannProjGraph = {\nnodes: [\n{id: 4, label: \"Bob\"},\n{id: 5, label: \"Alice\"},\n{id: 6, label: \"Michael\"},\n{id: 7, label: \"Urs\"},\n{id: 8, label: \"Karen\"},\n{id: 9, label: \"John\"},"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#12", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "Header 3": "Bipartite Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#12", "page_content": "export const eventNewmannProjGraph = {\nnodes: [\n{id: 4, label: \"Bob\"},\n{id: 5, label: \"Alice\"},\n{id: 6, label: \"Michael\"},\n{id: 7, label: \"Urs\"},\n{id: 8, label: \"Karen\"},\n{id: 9, label: \"John\"},\n{id: 10, label: \"Emma\"},\n{id: 11, label: \"David\"},\n{id: 12, label: \"Sophia\"},\n],\nedges: [\n// Bob knows everyone except John\n{from: 4, to: 5, label: \"1\", length: 200},\n{from: 4, to: 6, label: \"0.17\", length: 200},\n{from: 4, to: 7, label: \"0.17\", length: 200},\n{from: 4, to: 8, label: \"0.17\", length: 200},\n{from: 4, to: 10, label: \"0.17\", length: 200},\n{from: 4, to: 11, label: \"0.17\", length: 200},\n{from: 4, to: 12, label: \"0.17\", length: 200},\n// Alice only knows Bob, covered above\n// John only knows Karen, Michael and Urs\n{from: 9, to: 6, label: \"0.33\", length: 200},\n{from: 9, to: 7, label: \"0.33\", length: 200},\n{from: 9, to: 8, label: \"0.33\", length: 200},\n// the rest all know each other\n{from: 6, to: 7, label: \"0.5\", length: 200},\n{from: 6, to: 8, label: \"0.5\", length: 200},\n{from: 6, to: 10, label: \"0.17\", length: 200},\n{from: 6, to: 11, label: \"0.17\", length: 200},\n{from: 6, to: 12, label: \"0.17\", length: 200},\n{from: 7, to: 8, label: \"0.5\", length: 200},\n{from: 7, to: 10, label: \"0.17\", length: 200},\n{from: 7, to: 11, label: \"0.17\", length: 200},\n{from: 7, to: 12, label: \"0.17\", length: 200},\n{from: 8, to: 10, label: \"0.17\", length: 200},\n{from: 8, to: 11, label: \"0.17\", length: 200},\n{from: 8, to: 12, label: \"0.17\", length: 200},\n{from: 10, to: 11, 
label: \"0.17\", length: 200},\n{from: 10, to: 12, label: \"0.17\", length: 200},\n{from: 11, to: 12, label: \"0.17\", length: 200},\n]\n} \n \nThe calculations are a bit more complicated than for the weighted projection, but nothing to complex: \n$$\n\\begin{align*}\n\\text{Alice to Bob} &= \\frac{1 \\cdot 1}{2-1} + \\frac{0 \\cdot 1}{7-1} + \\frac{0 \\cdot 0}{4-1} = 1 \\\\\n\\text{Karen to John} &= \\frac{0 \\cdot 0}{2-1} + \\frac{1 \\cdot 0}{7-1} + \\frac{1 \\cdot 1}{4-1} = 0.33 \\\\\n\\text{etc...}\n\\end{align*}\n$$ \n##### Overlap Weighted Projection"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#13", "metadata": {"Header 1": "Multirelational Graph", "Header 2": "N-Mode Networks", "Header 3": "Bipartite Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/multirelational.mdx#13", "page_content": "\\text{Karen to John} &= \\frac{0 \\cdot 0}{2-1} + \\frac{1 \\cdot 0}{7-1} + \\frac{1 \\cdot 1}{4-1} = 0.33 \\\\\n\\text{etc...}\n\\end{align*}\n$$ \n##### Overlap Weighted Projection \nThe overlap weighted projection is similar to the weighted projection, but instead of using the number of common\nneighbors, it uses jaccard similarity. So the weight of the edge between two vertices $a$ and $b$ is: \n$$\nw_{ab} = \\frac{N(a) \\cap N(b)}{N(a) \\cup N(b)}\n$$"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#1", "metadata": {"Header 1": "Network Reduction", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#1", "page_content": "Often times when working with big networks such as social networks there is too much data to be able to visualize it all\nat once or analyze. To make it easier to work with the data, we can either create a specific view of the data like when\nworking with database tables, or we can use sampling to take samples of the data."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#2", "metadata": {"Header 1": "Network Reduction", "Header 2": "Views", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#2", "page_content": "To visualize the different views we will use the following network as an example, where the different colors represent\ndifferent departments in a company: \n- Blue: IT department\n- Red: HR department\n- Green: Management department \nexport const departmentGraph = {\nnodes: [\n{id: 1, label: \"Bob\", color: \"blue\"},\n{id: 2, label: \"Alice\", color: \"blue\"},\n{id: 3, label: \"Michael\", color: \"blue\"},\n{id: 4, label: \"Urs\", color: \"blue\"},\n{id: 5, label: \"Karen\", color: \"blue\"},\n{id: 6, label: \"David\", color: \"green\"},\n{id: 7, label: \"Emily\", color: \"green\"},\n{id: 8, label: \"Linda\", color: \"red\"},\n{id: 9, label: \"John\", color: \"red\"},\n],\nedges: [\n// Edges within the IT department (blue)\n{from: 1, to: 2},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 2, to: 5},\n{from: 5, to: 1},\n{from: 1, to: 3},\n{from: 3, to: 4},\n{from: 3, to: 5},\n// Edges within the HR department (green)\n{from: 6, to: 7},\n{from: 7, to: 6},\n// Edges within the Management department (red)\n{from: 8, to: 9},\n// Edges connecting different departments\n{from: 2, to: 6},\n{from: 5, to: 9},\n{from: 4, to: 8},\n],\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#3", "metadata": {"Header 1": "Network Reduction", "Header 2": "Views", "Header 3": "Local View", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#3", "page_content": "The local view focuses on a specific group of nodes and their connections. So the local view is a subset of the entire\nnetwork that has been selected based on some criteria. For example if we have a network of people in a company, we can\ncreate a local view of the network that only contains people that are in the same department to analyze how they\ncommunicate with each other. \nBelow you can see the local view of the IT department: \nexport const localGraph = {\nnodes: [\n{id: 1, label: \"Bob\", color: \"blue\"},\n{id: 2, label: \"Alice\", color: \"blue\"},\n{id: 3, label: \"Michael\", color: \"blue\"},\n{id: 4, label: \"Urs\", color: \"blue\"},\n{id: 5, label: \"Karen\", color: \"blue\"},\n],\nedges: [\n// Edges within the IT department (blue)\n{from: 1, to: 2},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 2, to: 5},\n{from: 5, to: 1},\n{from: 1, to: 3},\n{from: 3, to: 4},\n{from: 3, to: 5},\n],\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#4", "metadata": {"Header 1": "Network Reduction", "Header 2": "Views", "Header 3": "Global View", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#4", "page_content": "The global view allows for a general view of the entire network. Here we summarize nodes to a single node based on some\ncriteria. For example if we have a network of people in a company, we can create a global view of the network that\nsummarizes all the people in the same department to a single node. This allows us to see how the different departments\ncommunicate with each other. \nBelow you can see the global view of the network (whether it is a good thing that HR an Management don't communicate\ndirectly is up for debate). \nexport const globalGraph = {\nnodes: [\n{id: 1, label: \"IT\", color: \"blue\"},\n{id: 2, label: \"HR\", color: \"red\"},\n{id: 3, label: \"Management\", color: \"green\"},\n],\nedges: [\n{from: 1, to: 2},\n{from: 1, to: 3},\n],\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#5", "metadata": {"Header 1": "Network Reduction", "Header 2": "Views", "Header 3": "Context View", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#5", "page_content": "The context view is a combination of the local and global view. First we summarize the network to a global view. Then\nwe pick a node in the global view and expand it again. If we have our previous example of a network of people in a\ncompany, we can create a context view of the network that summarizes all the people in the same department to a single\nnode. Then we pick a department and expand it again to see how the people in that department communicate with the other\ndepartments. \nFrom the graph below we could assume that Alice is the team lead of the IT department, since she is the one that talks\nto management. \nexport const contextGraph = {\nnodes: [\n{id: 1, label: \"Bob\", color: \"blue\"},\n{id: 2, label: \"Alice\", color: \"blue\"},\n{id: 3, label: \"Michael\", color: \"blue\"},\n{id: 4, label: \"Urs\", color: \"blue\"},\n{id: 5, label: \"Karen\", color: \"blue\"},\n{id: 6, label: \"HR\", color: \"red\"},\n{id: 7, label: \"Management\", color: \"green\"},\n],\nedges: [\n// Edges within the IT department (blue)\n{from: 1, to: 2},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 2, to: 5},\n{from: 5, to: 1},\n{from: 1, to: 3},\n{from: 3, to: 4},\n{from: 3, to: 5},\n// Edges connecting different departments\n{from: 2, to: 7},\n{from: 5, to: 6},\n{from: 4, to: 6},\n],\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#6", "metadata": {"Header 1": "Network Reduction", "Header 2": "Views", "Header 3": "Ego View", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#6", "page_content": "Ego/node/focus view is a view of the network that is centered around a specific node. In this view, the selected node\nis the \"ego,\" and its immediate connections, i.e. its neighbours are analyzed. \nexport const egoGraph = {\nnodes: [\n{id: 1, label: \"Bob\", color: \"blue\"},\n{id: 2, label: \"Alice\", color: \"blue\"},\n{id: 3, label: \"Michael\", color: \"blue\"},\n{id: 5, label: \"Karen\", color: \"blue\"},\n{id: 9, label: \"John\", color: \"red\"},\n],\nedges: [\n{from: 1, to: 5},\n{from: 2, to: 5},\n{from: 5, to: 1},\n{from: 3, to: 5},\n{from: 5, to: 9},\n],\n}; \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#7", "metadata": {"Header 1": "Network Reduction", "Header 2": "Views", "Header 3": "Filtering Edges", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#7", "page_content": "Another common method is to remove edges from the network based on some criteria. For example, if we have a network of\npeople in a company, we can remove all edges that are not between people in the same department to analyze how each\ndepartment communicates internally. \nOr if we had a network with weights on the edges, we could remove all edges that have a weight below a certain threshold. \nexport const filteredGraph = {\nnodes: [\n{id: 1, label: \"Bob\", color: \"blue\"},\n{id: 2, label: \"Alice\", color: \"blue\"},\n{id: 3, label: \"Michael\", color: \"blue\"},\n{id: 4, label: \"Urs\", color: \"blue\"},\n{id: 5, label: \"Karen\", color: \"blue\"},\n{id: 6, label: \"David\", color: \"green\"},\n{id: 7, label: \"Emily\", color: \"green\"},\n{id: 8, label: \"Linda\", color: \"red\"},\n{id: 9, label: \"John\", color: \"red\"},\n],\nedges: [\n// Edges within the IT department (blue)\n{from: 1, to: 2},\n{from: 1, to: 4},\n{from: 1, to: 5},\n{from: 2, to: 5},\n{from: 5, to: 1},\n{from: 1, to: 3},\n{from: 3, to: 4},\n{from: 3, to: 5},\n// Edges within the HR department (green)\n{from: 6, to: 7},\n{from: 7, to: 6},\n// Edges within the Management department (red)\n{from: 8, to: 9},\n],\n}; \n \n#### Inter and Intra-Edges \nA form of filtering edges is to reduce a network down to its inter- or intra-edges. \nInter-edges can be defined as the edges that connect vertices between two different groups or communities, and intra-edges\nconnect vertices within a group. \nIf, for example, we have a graph containing people in a company, we can group them by their gender. 
Then we can for\nexample only look at the edges between same gendered people (intra-edges) or between different gendered people\n(inter-edges). \n\nAdd example\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#8", "metadata": {"Header 1": "Network Reduction", "Header 2": "Sampling", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/networkReduction.mdx#8", "page_content": "Sampling is the process of taking a subset of the data and working with that instead of the entire network. \n\nThis is probably more general and doesn't need to be in this section\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#1", "metadata": {"Header 1": "Node Embeddings", "Header 2": "DeepWalk", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#1", "page_content": "Think of random walks as being like sentences, and nodes as being like words. We can then use the SkipGram model to learn embeddings for nodes. In other words,\ngiven a random walk missing a node, we want to predict the missing node. \nThe goal is to have nodes that are close in the graph be close in the embedding space. \nIs it really SkipGram and not CBOW? \nThe probability formulas don't make a lot of sense to me. \nWe use random walks to get the context of a node without having to look at the entire graph. The random walk is unbiased,\ni.e. it is not biased towards any particular node, it chooses the next node uniformly at random. \nsimilar context => similar nodes\nsimilar sentences => similar words \nWhy use the dot product instead of cosine similarity?\nhttps://developers.google.com/machine-learning/clustering/similarity/measuring-similarity \nEncoder: node to embedding.\nDecoder: embedding to similarity, i.e. just a simple dot product? \nThe encoder can be just a lookup table, i.e. the embedding matrix. For every node we have a row (or column?) in the embedding matrix. We then learn the embedding matrix.\nThis is not very scalable though if we have a lot of nodes, because the matrix is V by D where V is the number of nodes and D is the embedding dimension. \nHow do we define the node similarities? Nodes are similar if they are linked, share neighbors, or have similar surroundings, i.e. context => random walks \n=> We are only learning the graph structure, not the node features. But it is unsupervised, so we don't need labels, which is good. Hence, the embeddings are also task independent. 
\nP(v|z_u) = the probability of visiting node v given we start a random walk at node u. This is our model prediction. \n=> softmax turns a vector of k real numbers into a vector of k real numbers that sum to 1, i.e. a probability distribution. A soft version of a max function. \nDon't forget that in the random walk you can also go back to where you came from. \nUsing the random walks is much more efficient than using the entire graph. But isn't the lookup table still V by D? So how is it efficient?"}}
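The random-walk part of DeepWalk can be sketched in a few lines of Java. This is my own minimal sketch, not from any library; names like `RandomWalks` and `walk` are assumptions. It generates one unbiased walk, and going back to where you came from is allowed:

```java
import java.util.*;

// Sketch of the unbiased random walks used by DeepWalk: from each node we
// repeatedly jump to a uniformly random neighbour (revisiting is allowed).
public class RandomWalks {
    // graph as adjacency lists: node id -> list of neighbour ids
    public static List<Integer> walk(Map<Integer, List<Integer>> graph, int start, int length, Random rng) {
        List<Integer> path = new ArrayList<>();
        path.add(start);
        int current = start;
        for (int i = 1; i < length; i++) {
            List<Integer> neighbours = graph.get(current);
            if (neighbours == null || neighbours.isEmpty()) break; // dead end
            current = neighbours.get(rng.nextInt(neighbours.size()));
            path.add(current);
        }
        return path;
    }
}
```

Many such walks per node would then play the role of "sentences" fed to SkipGram.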
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#2", "metadata": {"Header 1": "Node Embeddings", "Header 2": "DeepWalk", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#2", "page_content": "Don't forget that in the random walk you can also go back to where you came from. \nUsing the random walks is much more efficient than using the entire graph. But isn't the lookup table still V by D? So how is it efficient? \nNR_u is the neighborhood of node u using the strategy R. So if we use a random walk of length 10, then NR_u is the set of nodes that are within 10 hops of node u, i.e. a multiset. \nWe maximize the sum of log-likelihoods of the random walks?"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#3", "metadata": {"Header 1": "Node Embeddings", "Header 2": "Node2Vec", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#3", "page_content": "The same idea, but instead of using an unbiased random walk, we use a biased random walk. \nNode2Vec develops a biased 2nd-order random walk that can swap between local and global search, i.e. breadth-first search and depth-first search. \nIt has two parameters p and q. p is the return parameter, i.e. how likely you are to go back to where you came from. q is the in-out parameter,\ni.e. how likely you are to go to a node that is close to you or far away from you, i.e. the ratio between breadth-first search and depth-first search. Depending on q\nwe decide to go further away from where we came from (DFS) or stay at the same level (BFS). \nIt is 2nd order because it remembers where it came from; 1st order is just a random walk."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#4", "metadata": {"Header 1": "Node Embeddings", "Header 2": "Embedding entire graph", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/nodeEmbeddings.mdx#4", "page_content": "First idea: just sum up or average the embeddings of the nodes. This was used in 2016 for molecule classification. \nOther idea: introduce a virtual node to represent the graph or subgraph. The virtual node is connected to all nodes that we want to embed. \nA 3rd idea is using anonymous walks. Instead of using the node labels, we enumerate the nodes in the order they are visited, so we don't know what a node actually is. This way\ndifferent walks can map to the same anonymous walk: A -> B -> A and A -> C -> A are the same walk, 1 -> 2 -> 1, agnostic to the node labels. \nWe can calculate the number of anonymous walks for a given length/number of nodes. \nWe could have a bag of walks, i.e. count the number of times a walk appears in the graph. With length 3 there are 5 possible anonymous walks, so we get a dimension of 5.\nThis can be seen as a probability distribution over the walks. \nTo know how many walks we need to sample we can use a formula with epsilon and delta. We can also learn an embedding of the anonymous walks."}}
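The anonymous relabeling can be sketched in a few lines of Java (my own sketch; the class and method names are assumptions): each node in a walk is replaced by the index of its first appearance.

```java
import java.util.*;

// Sketch: turn a walk over labelled nodes into its anonymous form, where each
// node is replaced by the 1-based index of its first appearance in the walk.
public class AnonymousWalk {
    public static List<Integer> anonymize(List<String> walk) {
        Map<String, Integer> firstSeen = new HashMap<>();
        List<Integer> result = new ArrayList<>();
        for (String node : walk) {
            // a node not seen before gets the next free index
            firstSeen.putIfAbsent(node, firstSeen.size() + 1);
            result.add(firstSeen.get(node));
        }
        return result;
    }
}
```

With this, both A -> B -> A and A -> C -> A map to the anonymous walk 1 -> 2 -> 1.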
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#1", "metadata": {"Header 1": "Shortest Path", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#1", "page_content": "Before we see how to find the shortest path from vertex $a$ to vertex $b$ we need to define a few things. \n$D(a,b)=$the length of the shortest path from vertex $a$ to vertex $b$. If no such path exists, the length is $\\infty$. \nIf the graph is unweighted the length of a path is the number of edges it takes to get from $a$ to $b$. If however the graph is weighted the length is the sum of the edge weights. \nSingle source shortest path (SSSP) are all the shortest paths from vertex $s$ to all other vertices."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#2", "metadata": {"Header 1": "Shortest Path", "Header 2": "For Unweighted Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#2", "page_content": "In an unweighted graph we can just use a BFS, as it visits all vertices with distance 1 first, then 2, etc. \n```java\nvoid BFS(Vertex s) {\n// all vertices are assumed to start with dist == Integer.MAX_VALUE\nQueue<Vertex> R = new LinkedList<>();\ns.dist = 0;\nR.add(s);\nwhile (!R.isEmpty()) {\nVertex v = R.remove();\nfor (Vertex w : v.adjList) {\nif (w.dist == Integer.MAX_VALUE) {\nw.dist = v.dist + 1;\nR.add(w);\n}\n}\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#3", "metadata": {"Header 1": "Shortest Path", "Header 2": "For weighted graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#3", "page_content": "For a weighted graph this is slightly trickier as the shortest path isn't necessarily the path with the least edges."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#4", "metadata": {"Header 1": "Shortest Path", "Header 2": "For weighted graphs", "Header 3": "Dijkstra's algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#4", "page_content": "1. Assign all the vertices the attributes \"finished\", \"distance\" and \"via/predecessor\". Initialize the distance of the root vertex as 0 and all others as $\\infty$.\n2. While there are unfinished reachable vertices, i.e. finished=false and distance $< \\infty$:\n1. Choose the vertex $v$ with the smallest distance.\n2. Set $v.finished = true$.\n3. For all vertices $w$ that have an edge between $v$ and $w$:\n1. Set int d = v.dist + edge weight between $v$ and $w$.\n2. if(d < w.dist) w.dist = d; w.via = v; \nThe time complexity of this algorithm is as follows. For one iteration of the while loop, step 1 takes $O(n)$, step 2 takes $O(1)$ and step 3 takes $O(outdeg(v))$. \nFor $n$ iterations of the while loop this then becomes $O(n^2 + m)$."}}
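The steps above can be sketched in Java as follows. This is my own minimal sketch (names like `Dijkstra` and `shortestDistances` are assumptions), using an adjacency matrix where a weight of 0 means "no edge", which gives exactly the $O(n^2 + m)$ behaviour described:

```java
import java.util.*;

// Sketch of Dijkstra's algorithm with a linear scan for the minimum, matching
// the finished/distance/via attributes from the steps above.
public class Dijkstra {
    public static int[] shortestDistances(int[][] w, int source) {
        int n = w.length;
        int[] dist = new int[n];
        int[] via = new int[n]; // predecessor of each vertex on its shortest path
        boolean[] finished = new boolean[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;
        while (true) {
            // step 1: pick the unfinished reachable vertex with the smallest distance
            int v = -1;
            for (int i = 0; i < n; i++)
                if (!finished[i] && dist[i] < Integer.MAX_VALUE && (v == -1 || dist[i] < dist[v])) v = i;
            if (v == -1) break; // nothing reachable is left
            finished[v] = true; // step 2
            for (int u = 0; u < n; u++) { // step 3: relax all outgoing edges of v
                if (w[v][u] == 0) continue;
                int d = dist[v] + w[v][u];
                if (d < dist[u]) { dist[u] = d; via[u] = v; }
            }
        }
        return dist;
    }
}
```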
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#5", "metadata": {"Header 1": "Shortest Path", "Header 2": "For weighted graphs", "Header 3": "Improvements", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/shortestPath.mdx#5", "page_content": "We could save the vertices that are not finished in a Set so we don't have to look through the entire table. This however doesn't have an effect on the time complexity. \nWe could instead save the vertices that are not finished in a Min-Heap. Initialization takes $O(n)$, step 1 then becomes deleteMin() which is $O(\\log n)$, and step 3 becomes decreaseKey, called $outdeg(v)$ times at $O(\\log n)$ each. With these improvements our time complexity is $O((n+m)\\log n)$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#1", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Hypothesis Testing", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#1", "page_content": "The goal of a hypothesis test is to determine if there is a statistical relationship between two variables. To test\nif there is a relationship between two variables we check if the two variables have a correlation. \nWhen building a hypothesis test we have to define the null hypothesis, often denoted as $H_0$, which states that\nthere is no relationship between the two variables. We then also define an alternative hypothesis $H_1$\nwhich states that there is a relationship between the two variables. \nThe null hypothesis is the hypothesis that we want to reject. If we reject the null hypothesis then we accept the\nalternative hypothesis. If we do not reject the null hypothesis then we do not accept the alternative hypothesis. \nHypothesis tests also have a significance level, which is the probability of rejecting the null hypothesis when it is\ntrue (so a false positive). The significance level is usually denoted as $\\alpha$ and is usually set to 0.05 (5%)."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#2", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Permutation Tests", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#2", "page_content": "When working with networks we use permutation tests to test our hypothesis because other tests assume that the data is\nindependent, which is not the case in networks for obvious reasons. \nIn a permutation test we start with the initial state of the network and calculate the statistic of interest, i.e. the\ncorrelation coefficient. We then randomly permute the network and calculate the statistic of interest again and keep\ntrack of how many times the statistic of interest is greater than the initial statistic of interest. We then repeat this\nprocess many times and compare the statistic of interest to the distribution of the permuted statistics. \nThe relative frequency of the statistic of interest being greater than the initial statistic of interest is the p-value.\nThe p-value is the probability of observing a statistic as extreme as the one observed given the null hypothesis is true. \n \nThe initial Pearson Correlation between the degree centrality and the \"happiness\" attribute of the vertices is 0.4. \nWe then permute the network 10000 times and calculate the Pearson Correlation between the degree centrality and the\n\"happiness\" attribute of the vertices for each permutation. We then count the number of times the Pearson Correlation\nwas greater or equal to 0.4 and divide it by the number of permutations. \nSay we counted 200 times that the Pearson Correlation was greater or equal to 0.4 then the p-value would be\n$200/10000 = 0.02$. This means that there is a 2% chance of observing a Pearson Correlation as extreme as 0.4\ngiven the null hypothesis is true, i.e. 
we say there isn't a correlation, but there is a 2% chance of observing a\ncorrelation as extreme as 0.4, so a 2% chance of a false positive. \nIf the p-value is less than the significance level, i.e. $p < \\alpha$, then we reject the null hypothesis and accept the\nalternative hypothesis, i.e. we say there is a correlation. If the p-value is greater than the significance level, i.e."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#3", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Permutation Tests", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#3", "page_content": "If the p-value is less than the significance level, i.e. $p < \\alpha$, then we reject the null hypothesis and accept the\nalternative hypothesis, i.e. we say there is a correlation. If the p-value is greater than the significance level, i.e.\n$p > \\alpha$, then we fail to reject the null hypothesis and do not accept the alternative hypothesis, i.e. we say there\nis no correlation. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#4", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Permutation Tests", "Header 3": "Monadic Tests", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#4", "page_content": "A monadic test is a test for a relationship between vertex attributes, for example we hypothesize that rich people\nare well connected. \n \nAs always we set the significance level to 5%, $\\alpha = 0.05$. We then calculate the initial Pearson Correlation\nbetween the two variables. \n$$\n\\begin{align*}\n\\text{degree} &= [0.1,0.2,0.3,0.5,0.6] \\\\\n\\text{wealth} &= [100, 400, 300, 900, 300]\n\\end{align*}\n$$ \nWe get the initial value $r=0.522$, so we would be inclined to say that there is a moderate correlation. \nWe then permute the network by shuffling both the degree and wealth attributes and calculate the Pearson Correlation again. \n$$\n\\begin{align*}\n\\text{degree} &= [0.5,0.3,0.1,0.6,0.2] \\\\\n\\text{wealth} &= [300, 300, 100, 400, 900]\n\\end{align*}\n$$ \nand this time get $r=-0.04$, which is a weak correlation. We repeat this process 10000 times and count the number of\ntimes $r \\geq 0.522$ and divide it by the number of permutations. Say we counted 400 times that $r \\geq 0.522$, then\nthe p-value would be $400/10000 = 0.04$. Because 4% is lower than the significance level of 5%, we\nreject the null hypothesis and accept the alternative hypothesis, i.e. we say there is a correlation.\n"}}
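This monadic permutation test can be sketched in Java as follows. It is my own minimal sketch (the names `PermutationTest`, `pearson` and `pValue` are assumptions), and it shuffles only one of the two attributes, which is statistically equivalent to shuffling both, since only the pairing of the values matters:

```java
import java.util.*;

// Sketch of a monadic permutation test: repeatedly shuffle one attribute,
// recompute the Pearson correlation, and estimate the p-value as the fraction
// of permutations with r >= the initial r.
public class PermutationTest {
    public static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;
        double cov = 0, vx = 0, vy = 0;
        for (int i = 0; i < n; i++) {
            cov += (x[i] - mx) * (y[i] - my);
            vx += (x[i] - mx) * (x[i] - mx);
            vy += (y[i] - my) * (y[i] - my);
        }
        return cov / Math.sqrt(vx * vy);
    }

    public static double pValue(double[] degree, double[] wealth, int permutations, Random rng) {
        double initial = pearson(degree, wealth);
        double[] shuffled = wealth.clone();
        int count = 0;
        for (int p = 0; p < permutations; p++) {
            for (int i = shuffled.length - 1; i > 0; i--) { // Fisher-Yates shuffle
                int j = rng.nextInt(i + 1);
                double tmp = shuffled[i]; shuffled[i] = shuffled[j]; shuffled[j] = tmp;
            }
            if (pearson(degree, shuffled) >= initial) count++;
        }
        return (double) count / permutations;
    }
}
```

On the degree/wealth vectors above, `pearson` reproduces the stated initial value of roughly 0.522.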
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#5", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Permutation Tests", "Header 3": "Dyadic Tests", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#5", "page_content": "Dyadic tests are tests between relationships, i.e. the edges of the network, unlike the monadic tests which are tests\nbetween vertex attributes. To be able to do a dyadic test we need to have a multirelational network, i.e. a network\nwith multiple types of edges. This is then easily split into a multiplex graph, i.e. we have a separate graph for each\ntype of relationship. \n\nFor example we want to test if students that study together also drink together. Then we have a network showing who\nstudies with whom and a network showing who drinks with whom. To then calculate the Pearson Correlation between the two\nnetworks we take their adjacency matrices, remove the diagonal elements because we are interested in relationships\nbetween different people, and then calculate the Pearson Correlation between the two flattened matrices. \n$$\n\\begin{align*}\n\\text{study} &= \\begin{bmatrix}\n0 & 10 & 4 & 3 \\\\\n10 & 0 & 2 & 0 \\\\\n4 & 2 & 0 & 2 \\\\\n3 & 0 & 2 & 0 \\\\\n\\end{bmatrix} &= [10,4,3,10,2,0,4,2,2,3,0,2] \\\\\n\\text{drink} &= \\begin{bmatrix}\n0 & 3 & 2 & 0 \\\\\n3 & 0 & 1 & 0 \\\\\n2 & 1 & 0 & 1 \\\\\n0 & 0 & 1 & 0 \\\\\n\\end{bmatrix} &= [3,2,0,3,1,0,2,1,1,0,0,1]\n\\end{align*}\n$$ \nBased on these vectors we can then do our hypothesis tests.\n \n#### QAP - Quadratic Assignment Procedure \nWhen doing dyadic tests we can't do our normal permutation tests and just randomly shuffle the vectors because we\nwould lose the structure of the network. To solve this we use the Quadratic Assignment Procedure (QAP) which is a\npermutation test for dyadic tests. 
\nWhen using QAP instead of randomly shuffling we swap an entire row and column with another. \n \nWe start with the initial adjacency matrix of the study network and then swap the first row and column with the\nsecond row and column. \n$$\n\\begin{align*}\n\\text{study} &= \\begin{bmatrix}\n0 & 10 & 4 & 3 \\\\\n10 & 0 & 2 & 0 \\\\\n4 & 2 & 0 & 2 \\\\\n3 & 0 & 2 & 0 \\\\\n\\end{bmatrix} \\Rightarrow \\begin{bmatrix}\n0 & 10 & 2 & 0 \\\\\n10 & 0 & 4 & 3 \\\\\n2 & 4 & 0 & 2 \\\\"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#6", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Permutation Tests", "Header 3": "Dyadic Tests", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#6", "page_content": "second row and column. \n$$\n\\begin{align*}\n\\text{study} &= \\begin{bmatrix}\n0 & 10 & 4 & 3 \\\\\n10 & 0 & 2 & 0 \\\\\n4 & 2 & 0 & 2 \\\\\n3 & 0 & 2 & 0 \\\\\n\\end{bmatrix} \\Rightarrow \\begin{bmatrix}\n0 & 10 & 2 & 0 \\\\\n10 & 0 & 4 & 3 \\\\\n2 & 4 & 0 & 2 \\\\\n0 & 3 & 2 & 0 \\\\\n\\end{bmatrix}\n\\end{align*}\n$$\n"}}
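The QAP swap can be sketched in Java like this (my own helper; the names `Qap` and `swap` are assumptions). Swapping vertices a and b means applying the same transposition to both the rows and the columns of the adjacency matrix, so the network structure is preserved:

```java
// Sketch of one QAP permutation step: relabel two vertices by swapping both
// their rows and their columns in the adjacency matrix.
public class Qap {
    public static int[][] swap(int[][] m, int a, int b) {
        int n = m.length;
        int[][] result = new int[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                // map index a to b and b to a, leave everything else alone
                int si = (i == a) ? b : (i == b) ? a : i;
                int sj = (j == a) ? b : (j == b) ? a : j;
                result[i][j] = m[si][sj];
            }
        return result;
    }
}
```

Applied to the study matrix with a=0 and b=1, this reproduces the swapped matrix shown above.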
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#7", "metadata": {"Header 1": "Statistics with Graphs", "Header 2": "Permutation Tests", "Header 3": "Mixed Tests", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/statistics.mdx#7", "page_content": "But what if we want to test for a correlation between a vertex attribute and its edges? For example, we want to\ntest if people of the same gender communicate more often with each other than with people of the opposite gender. \nSuch a test is called a mixed test, but in the end it is just a dyadic test. We create a new network where the edges\nsomehow signify the vertex attributes. For example, we can create a network where the people are only connected if\nthey have the same gender. \nWe then simply do our dyadic test. \nAnother possible test could be to see if the age of a person is correlated with the way they communicate. To do this\nwe can create a fully connected network where the weight of the edges is the age difference between the two people,\nand just like that we can do a dyadic test."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#1", "metadata": {"Header 1": "Storing Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#1", "page_content": "Because I am a computer scientist I don't just care about using graphs but also about how to store them. There are\nmultiple ways of storing graphs. Depending on the type of graph and the requirements a certain storage method might\nbe preferred as it might be more efficient in terms of memory or time complexity."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#2", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Adjacency Matrix", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#2", "page_content": "An adjacency matrix is the most common and the most simple way of storing graphs. If we have a graph with $n$ vertices,\nwe create an adjacency matrix with dimensions of $n \\times n$. As the name suggests this matrix stores the adjacency of\nvertices i.e. the relationship between the vertices i.e. the edges. \nBelow you can see an example of a very simple adjacency matrix of an undirected unweighted graph. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#3", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Adjacency Matrix", "Header 3": "Weighted and Unweighted Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#3", "page_content": "If we have a weighted graph where the weights are integer values we can create the following matrix: \n```java\nint[][] G = new int[n][n]\n``` \nThe weight of the edge from vertex $x$ to vertex $y$ would then be stored at `G[y][x]`. \nIf there is no edge between 2 vertices then there are multiple ways to indicate this. The simplest way would be to set\nthe value to 0 (which it already is at initialization) another common approach is to set the weight to\n`Integer.MAX_VALUE`. \nThe last possibility but the worst in space complexity is to use a `null` value. This is only possible if we use an\nobject array instead of a primitive array. This would mean that instead of using `int[][]` we use `Integer[][]`.\nThis would however mean that we would end up using a lot more memory as you can read more about\n[here](https://stackoverflow.com/a/65568047/10994912). \nIf the graph is unweighted we can use the same int array and just store all edge weights as 1 or 0. We could however\nalso use a 2D boolean array which interestingly does use less space in Java than an int array. In Java, a normal boolean\nvariable uses 32 bits like an int. However, in an array each boolean value only takes up 8 bits because that is what the\nCPU likes to use internally. This means that a boolean array uses 4 times less space than an int array."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#4", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Adjacency Matrix", "Header 3": "Directed and Undirected Graphs", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#4", "page_content": "If the graph is undirected then we commonly set the value at `G[y][x]` and `G[x][y]` to the same value. This means that\nmathematically $G = G^T$, i.e. the matrix is symmetric along the diagonal, and that we could get away with only storing\nthe upper or lower triangle of the matrix, which would halve the memory usage. However, this would make the code more\ncomplicated. \nIf the graph is directed then we only set the value at `G[y][x]` and not at `G[x][y]`. So $G = G^T$ is no longer\nguaranteed. \nThe biggest problem with storing graphs with an adjacency matrix is that its space complexity is $O(n^2)$. What makes\nthis worse is that a lot of the space is wasted as in most cases there are only a few edges between vertices, making it\na sparse matrix. Very rarely do we need to store a [complete graph](generalDefinition#complete-graphs). \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#5", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Adjacency Matrix", "Header 3": "Implementation", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#5", "page_content": "As mentioned, although the adjacency matrix isn't the most efficient way of storing graphs, it is the most common and\nvery simple to implement. Below you can see a simple implementation for an undirected unweighted graph. \n```java\npublic class UndirectedUnweightedGraph {\nfinal boolean[][] adjMatrix;\nfinal int n;\n\npublic UndirectedUnweightedGraph(int numNodes) {\nif (numNodes < 1) throw new IllegalArgumentException();\nthis.adjMatrix = new boolean[numNodes][numNodes];\nthis.n = numNodes;\n}\n\npublic boolean addEdge(int x, int y) {\nif (0 <= x && x < n && 0 <= y && y < n) {\nif (adjMatrix[y][x]) return false; // already set\nadjMatrix[y][x] = adjMatrix[x][y] = true;\nreturn true;\n}\nthrow new IndexOutOfBoundsException();\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#6", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Edge List", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#6", "page_content": "Another simple but less common way of storing graphs is to just store a list of the edges. The edges could have the\nfollowing structure: \n```java\nclass Edge {\nint from, to, weight;\n}\n``` \nIf it is an unweighted graph the weight attribute could just be omitted and if it is an undirected graph you can either\nhave two entries for each edge or handle from and to the same way when processing. \nThe advantage of this solution is that it only uses $O(m)$ memory with $m$ being the number of edges. The disadvantage\nof this storage solution is that you can not quickly find out how many vertices are in the graph and what they are. This\ncould however be solved by just adding another list containing all the vertices. This solution would then be very\nsimilar to the formal definition of a graph $G=(V, E)$. We would then have a memory usage of $O(n+m)$ with $n$ being the\nnumber of vertices and $m$ the number of edges. \n\nImplement this\n"}}
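A minimal sketch of the edge list combined with a vertex list, i.e. the $G=(V, E)$ style storage with $O(n+m)$ memory (my own sketch; the names `EdgeListGraph` and its methods are assumptions):

```java
import java.util.*;

// Sketch of G=(V, E) storage: a set of vertices plus a flat list of weighted,
// directed edges, using O(n+m) memory.
public class EdgeListGraph {
    public static class Edge {
        public final int from, to, weight;
        public Edge(int from, int to, int weight) {
            this.from = from; this.to = to; this.weight = weight;
        }
    }

    private final Set<Integer> vertices = new HashSet<>();
    private final List<Edge> edges = new ArrayList<>();

    public boolean addVertex(int v) { return vertices.add(v); }

    // only allow edges between vertices that were added to V
    public boolean addEdge(int from, int to, int weight) {
        if (!vertices.contains(from) || !vertices.contains(to)) return false;
        return edges.add(new Edge(from, to, weight));
    }

    public int numVertices() { return vertices.size(); }
    public int numEdges() { return edges.size(); }
}
```

For an undirected graph one would either add both directions or treat `from` and `to` symmetrically when reading.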
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#7", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Adjacency Lists", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#7", "page_content": "An adjacency list stores for each vertex a list of its edges. The collection of vertices can just be a simple array, but the\nlist storing the edges is most commonly a linked list, due to its storage and length being dynamic. If the graph is\nundirected you can again either handle it by just storing an edge in one of the lists, for example always in the source\nvertex's list, or you can also store it additionally in the destination vertex's list. This structure uses, just like the edge\nlist, $O(n+m)$ memory with $n$ being the number of vertices and $m$ the number of edges."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#8", "metadata": {"Header 1": "Storing Graphs", "Header 2": "Adjacency Lists", "Header 3": "Implementation", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/storingGraphs.mdx#8", "page_content": "\nIs this really correct and the way I want it?\n \n```java\npublic class UnweightedGraph<K> {\nprivate static class Vertex<K> {\nK data;\nint indegree, deg = 0;\nboolean visited;\nList<Vertex<K>> adjList = new LinkedList<>();\n\nVertex(K value) {\ndata = value;\n}\n\nboolean addEdgeTo(Vertex<K> to) {\nreturn (adjList.contains(to)) ? false : adjList.add(to);\n}\n}\n\nprivate Map<K, Vertex<K>> vertices;\nprivate int nOfEdges = 0;\n\npublic UnweightedGraph() {\nthis(false);\n}\n\npublic UnweightedGraph(boolean directed) {\nsuper(directed); // assumes a base graph class that stores the direction and provides isDirected()\nvertices = new HashMap<>();\n}\n\npublic UnweightedGraph(UnweightedGraph<K> orig) { // copy constructor\nthis(orig.isDirected());\nfor (K k: orig.vertices.keySet()) {\naddVertex(k);\n}\nfor (Vertex<K> v: orig.vertices.values()) {\nfor (Vertex<K> w: v.adjList) {\naddEdge(v.data, w.data);\n}\n}\n}\n\npublic boolean addVertex(K vertex) {\nif (vertex != null && !vertices.containsKey(vertex)) {\nvertices.put(vertex, new Vertex<>(vertex));\nreturn true;\n} else {\nreturn false;\n}\n}\n\npublic boolean addEdge(K from, K to) {\nVertex<K> vf = vertices.get(from);\nVertex<K> vt = vertices.get(to);\nif (vf != null && vt != null && vf.addEdgeTo(vt)) {\nvt.indegree++;\nif (!isDirected()) {\nvt.addEdgeTo(vf);\nvf.indegree++;\n}\nnOfEdges++;\nreturn true;\n} else {\nreturn false;\n}\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#1", "metadata": {"Header 1": "Topological Sort/Ordering", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#1", "page_content": "The goal of a topological sort is given a list of items with dependencies, (ie. item 5 must be completed before item 3, etc.) to produce an ordering of the items that satisfies the given constraints. In order for the problem to be solvable, there can not be a cyclic set of constraints. (We can't have that item 5 must be completed before item 3, item 3 must be completed before item 7, and item 7 must be completed before item 5, since that would be an impossible set of constraints to satisfy.) Meaning we can model this problem with a directed unweighted acyclic graph. When all the vertices are topologically ordered in a row all the edges go from left to right."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#2", "metadata": {"Header 1": "Topological Sort/Ordering", "Header 2": "Kahn's Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#2", "page_content": "Kahn's algorithm describes a way in which we can find a topological order. In pseudocode the algorithm goes like this \n```c\nL ← Empty list that will contain the sorted elements\nS ← Set of all nodes with no incoming edge so indeg(v)=0\n\nwhile S is not empty do\nremove a node n from S\nadd n to L\nfor each node m with an edge e from n to m do\nremove edge e from the graph\nif m has no other incoming edges then\ninsert m into S\n!!!!!OR!!!!!\nfor each node m with an edge e from n to m do\nreduce indeg(m) by one\nif indeg(m)==0\ninsert m into S"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#3", "metadata": {"Header 1": "Topological Sort/Ordering", "Header 2": "Kahn's Algorithm", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#3", "page_content": "if graph still has edges then\nreturn error (graph has at least one cycle)\nelse\nreturn L (a topologically sorted order)\n``` \nDepending on the order that the nodes n are removed from set S, a different solution is created. A possible solution with an Adjacent matrix could look something like this. \n```java\npublic int[] topsort(){\nint[] indeg = new int[n]; // calculate indegree\nfor (int i=0; i> queue = new LinkedList>();\nint counter = 0;\nfor (Vertex v : vertices.values()) {\nv.deg = v.indegree; // set indegree of each vertex\nif (v.deg == 0) queue.addFirst(v); // start set\n}\nwhile (!queue.isEmpty()) {\nVertex v = queue.removeLast();\nsb.append(v.data + \" \");\ncounter++; // count processed vertices\nfor (Vertex w : v.adjList)\nif (--w.deg == 0) // decrease indegree of adjecent\nqueue.addFirst(w); // Add to S\n}\nif (counter != vertices.size()) {\nsb.replace(0, sb.length(), \"Cycle found\");\n}\n} else {\nsb.append(\"Graph is not directed, TopSort not possible.\");\n}\nSystem.out.println(sb);\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#4", "metadata": {"Header 1": "Topological Sort/Ordering", "Header 2": "Kahn's Algorithm", "Header 3": "Time Complexity", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/graphsNetworks/topologicalSortOrdering.mdx#4", "page_content": "If we have $n$ vertices with $m$ edges and have stored that graph in an Adjacency list. \n- To calculate all vertices indeg takes $O(n+m)$.\n- Adding all vertices with indeg 0 to S takes worst case $O(n)$.\n- Each edge is followed once to reduce the indeg of a vertex $O(m)$.\n- Each node is added and removed once from S so $O(n)$ \nThis then leads to a time complexity of $O(n+m)$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/bubbleSort.mdx#1", "metadata": {"Header 1": "Bubble Sort", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/bubbleSort.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/bubbleSort.mdx#1", "page_content": "When starting to study algorithms and data structures, the first algorithm that is usually taught is the Bubble Sort.\nIt is a simple sorting algorithm that is easy to understand and implement. However, it is not very efficient and is not\nused in practice. \nThe algorithm works by repeatedly swapping neighboring elements if they are in the wrong order, i.e. if we are sorting\nan array of numbers in ascending order we check if the current number is greater than the next number, if so we swap\nthem, if not we go to the next. \nThe algorithm is called Bubble Sort because with each iteration the largest element in the array \"bubbles up\" to the end\nof the array due to this swapping. \n \n```python\ndef bubble_sort(arr):\nfor i in range(len(arr)):\nfor j in range(len(arr) - 1):\nif arr[j] > arr[j + 1]:\narr[j], arr[j + 1] = arr[j + 1], arr[j]\n``` \nFrom the code above we can see that the algorithm has a time complexity of $O(n^2)$ and a space complexity of $O(1)$ as\nit works in place. \nThe above implementation can be slightly by considering the fact that after each iteration the largest element at the\nend of the array is already in the correct position, so we can reduce the number of iterations by 1 each time. \n```python\ndef bubble_sort(arr):\nfor i in range(len(arr)):\nfor j in range(len(arr) - i - 1):\nif arr[j] > arr[j + 1]:\narr[j], arr[j + 1] = arr[j + 1], arr[j]\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx#1", "metadata": {"Header 1": "Counting Sort", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx#1", "page_content": "The counting sort can be defined in a dumb way or a smart way that can be implemented very efficiently. \nThe counting algorithm is mainly intended as a sub-routine for the radix sort and unlucky many other sorting algorithms,\nit does not use comparisons to sort the elements. The counting sort is generally defined to sort a list of integers\nin a known range, most commonly 0 to k, i.e. $0 \\leq A[i] \\leq k$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx#2", "metadata": {"Header 1": "Counting Sort", "Header 2": "The Naive Approach", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx#2", "page_content": "The naive way to define the counting sort as follows: \n\nThe native nextra steps is shit.\n \n \n\n**Create the Counting Array** \nWe create a so-called counting array, an array of size k+1 and then iterate over the input array and increment the value\nat the index of the input array in the counting array. \n \n\n\n\n**Generate the Sorted Array** \nThanks to the counting array we then know how many times each value occurs in the input array and can therefore just\ngenerate the sorted array according to the counting array. Technically this means the dumb counting sort is an in-place\nsorting algorithm, because we don't actually need to allocate a new array to store the sorted array, we can just\noverwrite the input array. \nThis is a very dumb way to do it, but it works and actually runs in $O(n+k)$ time, where $n$ is the\nlength of the input array and $k$ is the range of the input array.\n\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx#3", "metadata": {"Header 1": "Counting Sort", "Header 2": "The Smart Approach", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/countingSort.mdx#3", "page_content": "The smart way to define the counting sort is to use the counting array as a way to store the index of a value in\nthe sorted array. This implementation of the counting sort is a bit more complicated but still runs in $O(n+k)$ time. \nHowever, the advantage of this implementation is that it is stable, i.e. it preserves the relative order of equal\nelements. The other advantage is that it is actually faster than the dumb implementation in practice because it\ncan be parallelized using the scan and scatter pattern. \n\n\n**Create the Counting Array** \nThe first step is the same as in the dumb implementation, we create a counting array. \n\n\n**Compute the Cumulative Sum** \nWe then iterate over the counting array and compute the cumulative sum of the counts. The cumulative sum is\nalso often called the prefix or scan sum. This part that can be parallelized using the scan pattern. \n \n\n\n**Scatter the Input Array** \nThe magical thing about the counting sort is that the values in the counting array are actually the indexes of the\nvalues in the sorted array. So we can just iterate over the input array and use the value at the index of the input\narray as the index in the sorted array and then increment the value at the index of the input array in the counting\narray. This part can be parallelized using the scatter pattern (not sure how you can handle the incrementing if the same\nvalue occurs multiple times in the input array). \nBecause of the scatter pattern, the counting sort becomes a stable and in-place sorting algorithm that runs in\n$O(n+k)$. \n \n\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/insertionSort.mdx#1", "metadata": {"Header 1": "Insertion Sort", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/insertionSort.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/insertionSort.mdx#1", "page_content": "The insertion sort algorithm is a bit more complex than the bubble sort or selection sort but still relatively simple.\nYou can think of it in the following way: \nImagine you have a deck of cards, and you want to sort them in ascending order.\nYou assume that the first card is already sorted. Then, you take the second card and compare it to the first card, if\nit is smaller, you place it in front of it. Now the first two cards are sorted. Then, you take the third card and\ncompare it to the ones before it, one after another until you find the right place to insert it, hence the name\ninsertion sort as you repeatedly insert cards into the sorted part of the deck. This way the sorted part of the array slowly grows from left to right. \n \n```python\ndef insertion_sort(arr):\nfor i in range(1, len(arr)):\nkey = arr[i]\nj = i - 1\n# Compare key with each element to the left of it until a smaller one is found\nwhile j >= 0 and key < arr[j]:\narr[j + 1] = arr[j]\nj -= 1\narr[j + 1] = key\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/selectionSort.mdx#1", "metadata": {"Header 1": "Selection Sort", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/selectionSort.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/selectionSort.mdx#1", "page_content": "The selection sort algorithm is probably one of the most intuitive sorting algorithms and the one I actually use when\nsorting a deck of cards. The algorithm works by iterating through the list, finding the smallest element, and moving it\nto the front of the array. Then, the algorithm repeats this process for the rest of the array, moving the next smallest\nelement to the second position, and so on until the array is sorted. \nThe algorithm is called a selection sort because it works by repeatedly selecting the smallest remaining element. \n \n```python\ndef selection_sort(arr):\nfor i in range(len(arr)):\nmin_index = i\n# Find the index of the smallest element in the unsorted portion of the array.\nfor j in range(i + 1, len(arr)):\nif arr[j] < arr[min_index]:\nmin_index = j\narr[i], arr[min_index] = arr[min_index], arr[i]\n``` \nJust like the bubble sort, the selection sort is an in-place sorting algorithm and has a time complexity of $O(n^2)$ and\na space complexity of $O(1)$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/stableSorting.mdx#1", "metadata": {"Header 1": "Stable Sorting Algorithms", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/stableSorting.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/sorting/stableSorting.mdx#1", "page_content": "Whether a sorting algorithm is stable or not is quite simple to determine. If the algorithm preserves the relative order\nof equal elements, it is stable. If it does not, it is not stable. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#1", "metadata": {"Header 1": "Binary Search Trees", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#1", "page_content": "A binary search tree is a [binary tree](./binaryTrees) where each node has a key. *key(v)=Key of the node v*. The important thing here is that in the left subtree of a node there are only nodes with keys that are smaller than the key of the node. In the right subtree accordingly only nodes with a key that are the same or larger. \nInterestingly when traversing the tree in-order we can see that the keys ascend. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#2", "metadata": {"Header 1": "Binary Search Trees", "Header 3": "Operations", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#2", "page_content": "#### Insert \nWhen inserting a node you need to find the ideal place for insertion. This is started by comparing it with the root and seeing whether it belongs in the left or right half. This is then repeated recursively repeated until the insertion point is found. \n#### Search \nWhen searching for a specific key we follow a similar process as with when inserting. By comparing the key with the roots key and then carrying on down the tree. If we don't just want the first node but all nodes with this key, once the first node is found we carry on down the right subtree until it is empty. \n#### Remove \nWhen removing a node we distinguish between 3 different cases. \n##### Leaf \nWe search for the node and then simply remove it, nothing complicated. \n \n##### Node with 1 child \nWe search for the node to be removed and remove it. The child of the removed node then takes its place. Also nothing complicated. \n \n##### Node with 2 children \nWe search for the node to be removed. We then look for the symmetrical (inorder) successor, which is the node in the right subtree that is the furthest left. We then replace the to be removed node with the symmetrical successor. Lastly we delete the symmetrical successor at its original position which is either case 1 or 2. 
\n \n```java\nprivate Node remove(Node node, K key){\nif (node == null) {\nreturn null;\n}\nelse {\nint c = key.compareTo(node.key);\nif (c < 0) {\nnode.left = remove(node.left, key);\n}\nelse if (c > 0) {\nnode.right = remove(node.right, key);\n}\nelse {\nif (node.left == null) {\nnode = node.right;\nnodeCount--;\n}\nelse if (node.right == null) {\nnode = node.left;\nnodeCount--;\n}\nelse {\nNode succ = symSucc(node.right);\nsucc.right = remove(node.right, succ.key);\nsucc.left = node.left;\nnode = succ;\n}\n}\nreturn node;\n}\n}"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#3", "metadata": {"Header 1": "Binary Search Trees", "Header 3": "Operations", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#3", "page_content": "private Node symSucc(Node node){\nNode succ = node;\nwhile (succ.left != null) {\nsucc = succ.left;\n}\nreturn succ;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#4", "metadata": {"Header 1": "Binary Search Trees", "Header 3": "Time complexities", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binarySearchTrees.mdx#4", "page_content": "The time complexities of the operations depend on the height of the tree. In the worst case all operations take O(n), when the tree has become like a list. In the best case all operations O(log n), this is when the tree is complete(excluding the last level). From this we can see it is important to keep the height as small as possible to have the best time complexities."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#1", "metadata": {"Header 1": "Binary Trees", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#1", "page_content": "A binary tree is a tree with the order of 2. Meaning that a node is either a leaf or has left and/or right child. \n \nBy adding empty leaves we can make sure the binary tree is always filled which can make certain operations and algorithms easier. We add the empty leaves by first adding 2 empty leaves to all leaves which make the leaves to inner nodes. Then all inner nodes that only have one child receive an empty leaf. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#2", "metadata": {"Header 1": "Binary Trees", "Header 2": "Traversal orders", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#2", "page_content": "There are multiple ways to traverse a tree each one giving a different result."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#3", "metadata": {"Header 1": "Binary Trees", "Header 2": "Traversal orders", "Header 3": "Pre-order, NLR", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#3", "page_content": "In this order a node visited before its left and right subtree is traversed. \n1. Visit the current node.\n2. Recursively traverse the current node's left subtree.\n3. Recursively traverse the current node's right subtree. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#4", "metadata": {"Header 1": "Binary Trees", "Header 2": "Traversal orders", "Header 3": "Post-order, LRN", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#4", "page_content": "In this order a node is visited after its left and right subtree has been traversed. \n1. Recursively traverse the current node's left subtree.\n2. Recursively traverse the current node's right subtree.\n3. Visit the current node. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#5", "metadata": {"Header 1": "Binary Trees", "Header 2": "Traversal orders", "Header 3": "In-order, LNR", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/binaryTrees.mdx#5", "page_content": "In this order a node is visited in between the traversal of its left and right subtree. \n1. Recursively traverse the current node's left subtree.\n2. Visit the current node.\n3. Recursively traverse the current node's right subtree. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#1", "metadata": {"Header 1": "B-Trees", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#1", "page_content": "The goal of a B-tree is to not have to load an entire tree into memory. Only a bit by bit can be loaded in for processing. The order of a B-tree means something slightly different then with a normal tree. \nIf a B-tree has the order of $n$ then it has to meet the following conditions: \n1. Each node has a maximum of $2n$ elements.\n2. Each node, expect for the root, has at least n elements.\n3. Each node, that is not a leaf, has $m+1$ successors, where $m$ is the number of elements the node has.\n4. All leaf nodes are on the same level.\n5. All nodes have $m$ keys + reference to its data that are sorted in ascending orders of there keys and $m+1$ references to its successors. \nBelow we can see a B-tree with the order of 2. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#2", "metadata": {"Header 1": "B-Trees", "Header 2": "Operations", "Header 3": "Search", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#2", "page_content": "To find a key we start of in a node and look at its elements. We then increase our counter i until one of the elements either is the key or is larger then the key. If the element is the key we get the corresponding data. If not and the node is a leaf we could not find the key. Otherwise we continue recursively by getting the node that is on the right side of the element. \n```java\nE find(Node node, K key) {\nint i = 0;\nwhile (i < node.m && key.compareTo(node.keys[i]) > 0) {\ni++;\n}\nif (i < node.m && key.equals(node.keys[i])) {\nreturn dataBlock(node.data[i]);\n}\nif (node.isLeaf()) {\nreturn null;\n}\nNode child = diskRead(node.successor[i]);\nreturn find(child, key);\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#3", "metadata": {"Header 1": "B-Trees", "Header 2": "Operations", "Header 3": "Insert", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#3", "page_content": "When inserting we always want to insert in a leaf node. Here 3 scenarios can happen. \n#### Case 1 \nThe leaf isn't full and we can just simply add it. \n \n#### Case 2a \nThe leaf is full so we have to split up the leaf. We split the leaf by taking the middle element and put it into the parent and create a new node with the elements that were to the right of the middle element. \n \n \n#### Case 2b \nIt can happen that when splitting the node and putting the middle element in the parent the parent is already full so the you have to perform another split. This process can repeat all the way to the root until a new root is created. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#4", "metadata": {"Header 1": "B-Trees", "Header 2": "Operations", "Header 3": "Remove", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#4", "page_content": "When removing an element we define 2 scenarios. Either the element we want to delete is in a leaf or in a inner node. \n#### From leaf \nIf after the removal of the element the amount of elements in the node is still larger or equal to n the order of the B-tree, so $m\\geq n$ nothing needs to be done. \n \nHowever if $m < n$ then the tree no longer meets the conditions we defined at the beginning. To restore these conditions we have to options. Either to borrow or combine. \n \n##### Borrow \nWe can use this operation when for example the right node doesn't have enough elements and the left node has more then $n$ elements or the other way around then we can simply do almost like a left or right rotation. \n \n##### Borrow Variant \nSince we have anyway loaded the neighbouring node into the RAM we might as well use this situation to balance out the 2 nodes so that they have equal amounts of elements. This can be done by doing multiple rotations in a row. \n \n##### Combine \nWe can use this operation when we can't borrow from a neighbouring node. So when no neighbours have more then $n$ elements. In this case we combine the 2 nodes together. \n \n##### Combine Variant \nHere we combine not 2 but 3 nodes together to make 2 nodes so that the resulting nodes are more then half full. The advantage of doing this is that the next insert or remove can be done very easily. \n \n#### From inner node \nHere just like in the binary tree we replace it with its symmetric successor which then always leads to a remove in a leaf."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#5", "metadata": {"Header 1": "B-Trees", "Header 2": "Time Complexities", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/bTrees.mdx#5", "page_content": "If n is the order and N the total number of elements then we have the following time complexities \n- Search: worst case from root to leaf so O(log(n) * N)\n- Insert: Find the place O(log(n) *N), insertion is in most cases constant but can be O(log(n)* N).\n- Remove: Find the place O(log(n) *N), removal is in most cases constant but can be O(log(n)* N)."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#1", "metadata": {"Header 1": "General Definition", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#1", "page_content": "Trees have **nodes** that hold the data and **edges** which connect the nodes. An **empty tree** obviously has no nodes and therefore no data. \n\nLink this up with graphs, rooted trees and forests somehow. Monte Carlo would also be cool.\n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#2", "metadata": {"Header 1": "General Definition", "Header 2": "Node Relationships", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#2", "page_content": "In trees there are a few relationships between nodes that are important to know: \n- A Tree like in our real world has a root. The **root** is the highest node in the tree. Each node is also a root of its own **subtree**.\n- A **child** node is a node that is connected to a node above it, the so called **parent**. A **Sibling** is a node that shares the same parent.\n- A **leaf** node has no children as it is hanging alone at the bottom of a subtree. An **inner node** however has at least 1 child. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#3", "metadata": {"Header 1": "General Definition", "Header 2": "Order", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#3", "page_content": "The **Order of a tree** is the max amount of children a tree is aloud to have. In the above picture we don't know the order of the tree but we can say that it is $\\geq 3$."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#4", "metadata": {"Header 1": "General Definition", "Header 2": "Degree", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#4", "page_content": "The **degree of a node** is the amount of children a specific node has. Often this is denoted as *deg(v)=Number of children of node*. So in the above tree *deg(Q)=2*."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#5", "metadata": {"Header 1": "General Definition", "Header 2": "Path", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#5", "page_content": "A **path** is a combination of edges between 2 Nodes. The length of the path is the amount of nodes visited whilst traversing from the start node to the end node. The amount includes the start and the end node. So in the above tree the length of the path from R to H is 4."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#6", "metadata": {"Header 1": "General Definition", "Header 2": "Height", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#6", "page_content": "The **height of a tree** is the length of the longest path from the root to a leaf. In the above tree the height is 5."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#7", "metadata": {"Header 1": "General Definition", "Header 2": "Depth", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#7", "page_content": "The **depth of a node** is the amount of nodes on the path to the root. Often this is denoted as *depth(v)=Number of nodes on path to root*. So in the above tree *depth(L)=4*."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#8", "metadata": {"Header 1": "General Definition", "Header 2": "Level", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#8", "page_content": "A **Level** is a grouping of all nodes with the same depth."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#9", "metadata": {"Header 1": "General Definition", "Header 2": "Full and Complete Trees", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/generalDefinition.mdx#9", "page_content": "A **Full tree** is a tree where all inner nodes have the maximum amount of Nodes according to the order. In the image below both trees are full with an order of 2. \nA **Complete tree** is a tree where each level has the maximum amount of nodes according to the order. In the image below only the left tree is complete with an order of 2. \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#1", "metadata": {"Header 1": "Heaps", "Header 2": "Priority queue", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#1", "page_content": "In a lot of cases we want a queue that is however influenced by a priority. The priority is the key of an element and must be comparable. The elements in the queue are then sorted by this priority resulting in HIFO, highest priority in first out. There are then 3 operations that can be performed on this priority queue. Elements can be added, we can look at the element with the highest priority and remove it. Depending there are then 2 ways in which we can define the key with the highest priority, either it is the largest or smallest key, which then lead to the following definitions: \nA Minimum Priority Queue where peek()=min() and remove()=removeMin() or a Maximum Priority Queue where peek()=max() annd remove()=removeMax()."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#2", "metadata": {"Header 1": "Heaps", "Header 2": "Priority queue", "Header 3": "Priorities", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#2", "page_content": "A priority can contain any element that has a comparable key. \nElement can either have a natural or predefined key and order. For example if the element is a number then the number can be used as its key. However if the element is a clubmember then we might use the amount of years he has been a member as the key. It gets tricky with role hierarchies for example in the military. So we can define the Interface like the following \n```java\npublic interface MinPriorityQueue> {\nboolean add(K element);\nK min();\nK removeMin();\nint size();\n}\n``` \nimportantly here is the extends Comparable \nWe might not have predefined priorities but priorities we assign when adding."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#3", "metadata": {"Header 1": "Heaps", "Header 2": "Min-Heap", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#3", "page_content": "A Min-Heap is a complete binary tree with the exception of the last level. On the last level it is filled from left to right. Importantly each nodes key is smaller or equal to that of its children. \n \nJust with this definition we can already easily implement 2 of the methods in the interface we defined both with O(1). The min() method is just the root and the size() method is done just like in all other collections with a counter."}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#4", "metadata": {"Header 1": "Heaps", "Header 2": "Min-Heap", "Header 3": "Add", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#4", "page_content": "We start by adding the new element to the furthest left free space on the lowest level or if it is already full left on a new level. To then correct the order the element slowly wanders up the tree by swapping with its parents until it is larger then its parent or is in the root. This process of wandering up the tree we call **sift up**. O(log n) = O(1)+O(log n) \n"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#5", "metadata": {"Header 1": "Heaps", "Header 2": "Min-Heap", "Header 3": "RemoveMin", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#5", "page_content": "We already know that the min element is the root so we can remove it. We then replace it with the furthest right element on the last level and let it sink down the tree, meaning we swap it with its smaller child until it is smaller then both of its children or is a leaf. This process of sinking down the tree we call **sift down**.\nO(log n) = O(1)+O(log n)"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#6", "metadata": {"Header 1": "Heaps", "Header 2": "Min-Heap", "Header 3": "Array Representation", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#6", "page_content": "We can also represent a Min-Heap as an array. \n \nWe can then see the following relationships for a node with the index $i$. These are all int operations so we ignore decimal points when deviding. \n| | Root at index 1 | Root at index 0 |\n| --------------------- | ------------------ | ------------------ |\n| Parent Node of i | i/2 | (i-1)/2 |\n| Left child of i | 2i | 2i+1 |\n| Right child of i | 2i + 1 | 2i+2 |\n| Indexes of all leaves | size/2+1 till size | size/2till size -1 |"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#7", "metadata": {"Header 1": "Heaps", "Header 2": "Min-Heap", "Header 3": "Building a heap from a filled array", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#7", "page_content": "The first idea is we want to add one element after another from front to back, so in the top down. We however notice that we save a bit of space but it takes O(n log n). So we need a second idea. \n#### Floyd's heap construction \nHere instead of a lot of elements having to be sifted up we let the elements sift down which then leads to an algorithm that is O(n). \n```java\nHeap(HeapNode[] elems) {\nthis.heap = elems;\nthis.size = elems.length;\nfor(int i=size/2; i>= 0; i--){\nsiftDown(i);\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#8", "metadata": {"Header 1": "Heaps", "Header 2": "Min-Heap", "Header 3": "Sorting using a heap (Heapsort)", "path": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx", "id": "../pages/digitalGarden/cs/algorithmsDataStructures/trees/heaps.mdx#8", "page_content": "This is the so called heapsort. We take an array and by using floyds heap construction, construct a heap. We can then just for the size of the array removeMin which leads to a time complexity of O(n log n) since the removeMin is O(log n). However we do need to have O(n) additional space. However this can be improved to O(1) if we construct the heap directly in the input array and add the removeMin and add it to the back of the input array so we at the end of the array we have sorted array. \n"}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#1", "metadata": {"Header 1": "Cloud Practitioner", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#1", "page_content": "There are sample questions somewhere\nHands on is free to 12 euro (depending on service different payment schemes but everything pay as you go per use/time)\nAws console is where you actually do stuff\nMake sure account is activated and choose free support plan"}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#2", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "What is cloud computing", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#2", "page_content": "Client, server via network with ip addresses. Just like postboxes.\nBuild up of a server: cpur, ram data and strucutred data (databases)\nNetwork our of cables routers switch and servers incl. dns \nMainting own data center(server cluster) pay rent, power, scaling is limited, people etc. solution: cloud \nCloud criteria from cloud computing lecture:\n- on demand\n- Broad network access\n- Ressource sharing/pooling multi-tenancy\n- Rapid elasticity and scaliblity\n- Measured service pay as you go \nTypes of clouds:\n- private (rackspace? Proxmox?)\n- public aws azure, gc\n- Hybrid some local some cloud. \nWhen using a cloud u Trade capital expense capex for operational expense opex \nDont need to guess capacity, save on having to maintain a data center, high availability and fault tolerance \nTypes of cloud computing: \n- IaaS, provide infrastructure highest level of flexibility (EC2, digital ocean)\n- PaaS, dont need to manage infrastructure just run app, elastic beanstalk\n- SaaS, just work no managment. Face rocgnition on aws or gmail \nShow different levels the 3 + on premise of what has to be managed. \nAws has map on infrastructure.aws:\n- regions\n- availability zones\n- data centers\n- edge locations/points of presence \nRegions have 3-6 zones with each 1 or more datacenters. Each zone has redudant power, network and connectivity. Availability zones are isolated from each other for disasters. And are connected with high bandwidth, low-latency network within region \nA bit unclear of meaning of edge locations? 
\nHow to choose a region:\n- compliance, data governance and legal requirements\n- Proximity, reduce latency\n- Availiability, not all services are available in all regions\n- Pricing \nSearch at the top is very useful as also has docs and tutorials \nSome services are global and have the same view no matter which region (can be seen in the top right) \nCan list services by region (link todo)"}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#3", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "What is cloud computing", "Header 3": "Shared Responsibility Model", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#3", "page_content": "Shared responsibility diagram shows who is responsible for what (u or aws) how u configure services are ur responsibility if u configure shit security ur fault. Aws is responsible for security on software and hardware level."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#4", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#4", "page_content": "Directly vs inline? Is it the same? \nThe first service to look at is IAM, short for Identity and Access Management. This is the service that allows you to manage users and their level of access to the AWS Management Console.\nBecause it is one of the most important services and is used to control access to all other services, it is a **global service**, meaning it is available in all regions. \nTo remember the acronym, think of yourself saying \"I am\" the person who is going to manage the users and their permissions. \nWhen you first create an AWS account, you have created a **root account**. This account has complete access to all AWS services and can therfore also rack up a huge bill.\nIt is not recommended to use the root account for everyday tasks, or to share it with others. Instead, you should create use the **least privilege principle** and create an\n**IAM user** for yourself and give it the necessary permissions. This is also a best practice for security reasons. \nIn AWS permissions are managed using **policies**. A policy is a document that defines permissions. It is written in JSON format and consists of a version, an identifier and a statement. \nFor example, the following policy `AdminstratorcAccess` allows all actions on all resources: \n```json\n{\n\"Version\": \"2012-10-17\",\n\"Statement\": [\n{\n\"Effect\": \"Allow\",\n\"Action\": \"*\",\n\"Resource\": \"*\"\n}\n]\n}\n``` \nThe `Effect` can be either `Allow` or `Deny`. The `Action` is the action that is allowed or denied on a `Resource`.\nActions are API calls that allow you to interact with AWS services such as read, write delete etc. 
The `Resource` is the resource that the policy applies to such as an S3 bucket or an EC2 instance.\nThe `*` is a wildcard that matches all actions or resources. \nWhen creating a new user, you can assign them permissions by attaching policies to them. You can also create **groups** and assign policies to the group and then add users to the group."}}
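In contrast to the broad `AdministratorAccess` policy, a customer managed policy can scope `Action` and `Resource` down to exactly what is needed; the bucket name below is a made-up placeholder, not from the notes:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::example-bucket",
                "arn:aws:s3:::example-bucket/*"
            ]
        }
    ]
}
```

A user with only this policy attached can read objects from that one bucket but do nothing else, which is the least privilege principle in practice.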
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#5", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#5", "page_content": "The `*` is a wildcard that matches all actions or resources. \nWhen creating a new user, you can assign them permissions by attaching policies to them. You can also create **groups** and assign policies to the group and then add users to the group.\nYou can also create an alias for the sign-in link, which can be useful if you are working with multiple accounts, for example for different projects. \nIn AWS users, groups, roles etc. are also reffered to as **identities**."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#6", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "Header 3": "Types of Policies", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#6", "page_content": "For a clear overview I suggest you look at the [AWS documentation](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_managed-vs-inline.html) on the topic. \nHowever, in short there are three types of policies: \n- **Managed Policies**: These are standalone policies that you can attach to multiple identities. They are maintained by AWS and are the recommended way to assign permissions.\nThis is for example the `AdministratorAccess` policy that allows full access to all AWS services or the `IAMFullAccess` policy that allows full access to IAM.\n- **Customer Managed Policies**: These are policies that you create and manage yourself. You can attach them to multiple identities. This is when you define a custom policy\nvia JSON or the visual editor. These policies are stored in your account and are reusable.\n- **Inline Policies**: These are basically customer managed policies that are embedded directly into a single identity and are then also deleted when the identity is deleted."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#7", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "Header 3": "Keeping Users Secure", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#7", "page_content": "Can be found under Account settings and then password policy. \nWhen creating users for others there are some password policies that you can set to ensure that the passwords are secure:\n- **Minimum password length**\n- **Require specific character types**\n- **Allow users to change their own password**\n- **Require password change on first login**\n- **Password expiration** and **password reuse prevention** \nThe most secure way to access the AWS Management Console is by using **Multi-Factor Authentication (MFA)**. \nMFA = something you know (password) and something you have (token on physical device) \nVirtual MFA devices like Google Authenticator or Authy\nUniversal 2nd Factor (U2F) security key like YubiKey\nHardware Key Fob MFA devices \nTo activate MFA its at the top right of the console under Security Credentials. You can then choose to activate MFA for the root account.\nFor IAM users you can activate the same MFA or enforce it for all users."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#8", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "Header 3": "How to access AWS", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#8", "page_content": "- **AWS Management Console**: Web-based user interface that you can use to access and manage AWS services.\n- **AWS Command Line Interface (CLI)**: Command line tool that allows you to control multiple AWS services from the command line and automate them through scripts.\n- **AWS Software Development Kits (SDKs)**: Libraries or APIs that allow you to interact with AWS services from your preferred programming language. \nVia access keys do not share. Can be used for CLI or SDKs. Can be rotated.\nTo configure CLI you can use `aws configure` and enter your access key id , secret access key, region and output format (just enter for default). \naws iam list-users to list users very useful \nWhat is cloudshell? is basically CLI in the management console, recommended to use it for CLI commands over local installation.\ncan alos download and upload files to it, seems very useful. Not available in all regions."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#9", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "Header 3": "Roles", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#9", "page_content": "In AWS some services might want to perform actions. For example an EC2 instance might want to access an S3 bucket. To do this you\ncan use **roles** which are similar to users but are assigned to AWS services."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#10", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "IAM - Identity and access management", "Header 3": "Security Tools", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#10", "page_content": "Can use **Credentials Report** to see all users and the status of their credentials. Account-level\nCan use **IAM Access Advisor** shows the services that a user has accessed and the last time they accessed them to align permissions with actual usage (least privilege principle). User-level"}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#11", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Billing and Cost Management", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#11", "page_content": "We can view usage and billing information in the **Billing and Cost Management Dashboard**. Here we can also set up budgets and alerts to notify us when we are exceeding our budget. \"charges by service\" is useful to see what is costing the most.\nZero cost budget or montly budget can be set up etc."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#12", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#12", "page_content": "One of the key compute services in AWS is EC2, short for Elastic Compute Cloud. \nIaaS \nMainly consist of:\n- EC2, renting virtual machines\n- EBS, virtual block storage for EC2 (what is block storage?)\n- ELB, load balancer\n- ASG, auto scaling group \nConfiguration:\n- OS, linux, mac or windows\n- cpu and ram\n- storage, network attached storage with EBS or EFS or hardware storage with instance store\n- network card for speed, security group for firewall, public ip, private ip, dns\n- bootstrap script for configuration, ec2 user data for startup script, only runs once when instance is launched (restart?) installing updates, softeare\n- ec2 user data script runs as root \nshow example of ec2 instances \nto ssh into ec2 create a key pair and use ssh -i key.pem ec2-user@ip\nin the security group allow port 22 for ssh and 80 for http as will start a Web server\nebs volumes can be configured like deletion on termination, encryption, snapshotting etc. \nat the bottom of the advanced section of the ec2 instance creation there is a user data section where you can add a\nbash script that will run on startup. This can be used to install software, configure the instance etc. Why is it called user data? \nAfter starting and stopping an instance the public ip will change. To have a static ip you can use an elastic ip. \nThere are different types of instances like general purpose (balanced), compute optimized, memory optimized and storage optimized. \nm5.2xlarge,\nm = instance class\n5 = generation\n2xlarge = size within the instance class \nec2instances.info is a useful website to compare instances"}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#13", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "Security Groups", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#13", "page_content": "Fundamental firewall rules for your EC2 instances. They control the inbound and outbound traffic to your instances. \nregualte:\n- access to ports\n- authorise ip ranges, also if ipv4 or ipv6\n- control of inbound network (ingress) and outbound network (egress) traffic \nBy default all inbound traffic is blocked and all outbound traffic is allowed. \nmany to many relationship between security groups and instances. \nIf a timeout occurs when trying to connect to an instance it is likely a security group issue.\nfor example if you are trying to connect to an instance via ssh and the security group does not allow port 22. \nIf you get connection refused it is likely an issue with the instance itself. \nCan authorize specific security groups to allow traffic between instances in the same security group without specifying ip addresses. \nMost important ports:\n- 22 for ssh\n- 21 for ftp, unencrypted\n- 22 SFTP, encrypted ftp over ssh\n- 80 for http unencrypted\n- 443 for https encrypted\n- 3389 for rdp remote desktop protocol, windows instances \n0.0.0.0/0 means all ip addresses \nec2 instance connect is a new feature that allows you to connect to an instance without manually using ssh. \nGo into the instance and click connect, then connect with ec2 instance connect. Will temporaily create a key pair and connect to the instance.\nPort 22 must be open in the security group. \nNever cofnigure of enter iam credentials into an instance. Use roles instead. 
\npurchsing options:\n- on demand, pay as you go\n- reserved instances, 1-3 years, cheaper up to 70%, upfront, partial upfront or no upfront\n- convertible reserved instances, change instance type, same but less discount\n- savings plans, commit to a certain amount of usage 10 per hour for 1 year, cheaper than on demand excess is on-demand\n- spot instances, bid for unused capacity, cheap but can be terminated at any time, less reliable up to 90% cheaper, not suitable for critical jobs"}}
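As a sketch of the ssh example above, opening port 22 on a security group can also be done from the CLI; the security group id below is a made-up placeholder:

```shell
# Allow inbound SSH (port 22) from all IPv4 addresses (0.0.0.0/0)
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 22 \
  --cidr 0.0.0.0/0
```

For a real setup you would usually restrict the `--cidr` to your own IP range instead of 0.0.0.0/0.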
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#14", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "Security Groups", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#14", "page_content": "- spot instances, bid for unused capacity, cheap but can be terminated at any time, less reliable up to 90% cheaper, not suitable for critical jobs\n- dedicated hosts, physical server for you, expensive, complicance requirements or licensing like per core etc.\n- dedicated instances, no other customers on the same hardware, expensive\n- capacity reservations, reserve capacity for specific instance type in a specific availability zone, expensive, are assured capacity, billed on-demand even if don't use \nthere is a table for between dedicated host vs dedicated instance"}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#15", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "EBS - Elastic Block Store", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#15", "page_content": "Block storage vs object storage? \n\"network usb stick\" \nstorage options for ec2 instances.\nnetwork drive that can be attached to an ec2 instance. persist data even if instance is terminated.\nNot a physical drive, more latency than local storage.\ncan only be attached to one instance at a time and are bound to a specific availability zone. \nto move ebs between AZs you can create a snapshot and then create a new volume from the snapshot in the new AZ.\nFixed size but can be increased over time. \ncan set \"delete on termination\" to false to keep the volume when the instance is terminated.\nby default the root volume is deleted on termination and additional volumes are kept. \nEBS Snapshots are backups of your EBS volumes. recommend to first detach the volume before creating a snapshot (make it clean).\nusing snapshots you can create new volumes, move volumes between AZs or regions, share snapshots with other accounts etc.\nsnapshots can be copied to other regions for disaster recovery. \ncan be moved to archive which is cheaper but takes longer to restore 24 hours to 72 hours. \nyou can setup rules for when snapshots are deleted to go into a recycle bin for 7 days before being permanently deleted. \"retention rules\""}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#16", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "AMI - Amazon Machine Image", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#16", "page_content": "powers ec2 instances. like software configuration, os, application server, applications etc.\nThis allows for faster boot times and consistency across instances as everything preconfigured and packeged. \na public ami is for example the amazon linux 2 ami. you can also create your own ami from an existing instance. think of docker images.\nCreate own ami or use marketplace ami can potentially save time but also cost and security concerns. \nAMIs are built from a ec2 instance. Ideally you should stop the instance before creating the AMI to ensure that the file system is in a consistent state.\nThis creation process will also create a snapshot of the EBS volumes attached to the instance."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#17", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "EC2 Image Builder", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#17", "page_content": "automate the creation of VMs or container images.\ni.e automate the creation, maintain, validate and test AMIs. \nStarts a builder ec2 instance where you can install software, configure etc. and then create an image from that.\nThen an instance is launched from the image and the image is tested with tests that you define. if they pass then published. \nThis can be setup on a schedule to ensure that the images are always up to date."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#18", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "EC2 Instance Store", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#18", "page_content": "certain instance types come with instance store volumes which are physically attached, so better performance but data is lost when the instance is stopped or terminated (ephemeral storage).\ngood for like caches, buffers, scratch data etc.\nrisk of data loss, so not recommended for critical data."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#19", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "EFS - Elastic File System", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#19", "page_content": "Managed network file system that can be shared across multiple ec2 instances. Think of it as a network usb stick like the AD.\nhighly available, scalable, expensive, pay for what you use no capacity planning.\ninstances can be across different AZs. \nEFS-IA is infrequent access storage class that is cheaper but has a retrieval fee. for data that is not accessed often.\nwill automatically move files that are not accessed often to the infrequent access storage class based on a lifecycle policy.\nonce you access the file it will be moved back to the standard storage class."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#20", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "Ec2 - elastic compute cloud", "Header 3": "FSx", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#20", "page_content": "use 3rd party high performance file systems like windows file server or lustre for high performance computing.\nnative and supported for windows via smb and integrated with Microsoft Active Directory for user authentication. \nLustre is Linux and cluster file system for high performance computing. for ML and analytics workloads."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#21", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "ELS and ASG", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#21", "page_content": "ELS is an elastic load balancer that distributes incoming traffic across multiple targets like ec2 instances.\nASG is an auto scaling group that automatically adjusts the number of ec2 instances in response to demand. \nScalability = ability to handle increased load, either:\n- vertically, for call center instead of junior operator hire senior operator, common for databases\n- horizontally (elasticity), for call center just get more operators, implies a distributed system, common for web servers\n- elasticity = ability to automatically increase or decrease capacity based on demand, auto-scaling pay per use\n- agility = agile development, fast deployment, fast changes, fast feedback \nhigh availability = ability to stay up and running, run in at least 2 availability zones to survivie a disaster. \nboth require load balancing and auto scaling. high availability also wants multi AZ mode for both. \nload balancer forward internet traffic to multiple ec2 instances downstream. load balancer can be internal or external, i.e internet facing or not.\nsingle point of acces, spread load, high availability, fault tolerance, health checks for downstream instances. \nEBS = managed load balancer, meaning aws manages the load balancer for you makes sure is up and running, scales automatically, no need to worry about maintenance.\nCan setup custom load balancer which is cheaper but more work. 
\n4 types of load balancers:\n- Application Load Balancer (ALB), http and https grpc, layer 7, http routing features, static dns/url\n- Network Load Balancer (NLB), TCP, high performance, layer 4, static ip through elastic ip\n- Gateway Load Balancer, layer 3, GENEVE protocol on ip packets, route traffic to firewalls managed by ec2 instances for intrusion detection etc. traffic is checked and insprected via GWLB.\n- Classic Load Balancer, retired 2023, layer 4 and 7 \ntraffic is routed to \"target groups\" which depend on the protocol and port. can have multiple target groups per load balancer. and each target group can have multiple targets."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#22", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "ELS and ASG", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#22", "page_content": "- Classic Load Balancer, retired 2023, layer 4 and 7 \ntraffic is routed to \"target groups\" which depend on the protocol and port. can have multiple target groups per load balancer. and each target group can have multiple targets. \nASG for auto scaling, like shopping during the day but not at night. can scale out (add instances) or scale in (remove instances) based on demand.\nCan setup min/desired/max, they can also be registerd to a target group. the ASG will also replace unhealthy instances because of health checks. \nlaunch templates are instructions for the ASG on how to launch instances. can be used to launch instances with specific configurations like ami, instance type, key pair etc. \nscaling strategies:\n- manual, not recommended, update the desired capacity manually\n- dynamic, either simple or step scaling, based on cloudwatch alarms based on a trigger like cpu usage > 70% for 5 minutes add or under 30% for 5 minutes remove instances??\ntarget tracking scaling, scale out or in to keep a metric at a specific value like cpu usage at 70%.\nlastly scheduled scaling, scale out or in based on a schedule like every saturday scale out for sports betting website.\n- predictive scaling, machine learning to predict future demand based on historical data. will provision instances in advance for easy pattern recognition."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#23", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "S3 - Simple Storage Service", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#23", "page_content": "many use cases like backup and storage, disaster recovery, archiving, hybrid cloud storage, data lakes, static websites etc. \nstore objects (files) in buckets (folders). buckets must have a globally unique name across all regions and accounts. buckets is a global service\nbut are stored in a region. \nobjects have a key (full path) prefix + object name. Seems like a folder but doesn't actually exist.\nobject values are the content. max size of an object is 5TB and can store any type of file like images, videos, backups, logs etc.\nWhen uploading more then 5GB must use \"multipart upload\", X/5GB parts.\nobject metadata, list of text key value pairs like content type from system or user.\nobject tags unicode key value pairs by user for organization, cost allocation etc. max 10 tags per object.\nversion id for versioning if versioning is enabled. can be used to restore previous versions of an object."}}
+{"id": "../pages/digitalGarden/cs/aws/practitioner.mdx#24", "metadata": {"Header 1": "Cloud Practitioner", "Header 2": "S3 - Simple Storage Service", "Header 3": "Security", "path": "../pages/digitalGarden/cs/aws/practitioner.mdx", "id": "../pages/digitalGarden/cs/aws/practitioner.mdx#24", "page_content": "- user based, IAM policies, sets which api calls a user can make\n- resource based, bucket policies, bucket wide rules such as public access, cross account access etc.\n- Object Access Control List (ACL), fine grained control\n- Bucket Access Control List (ACL), less common \nobjects can be encrypted with encryption keys \nBucket policies, that allows anyone to read, i.e public read access. principal are the users that are allowed to do the action, therefor * is everyone. \n```json\n{\n\"Version\": \"2012-10-17\",\n\"Statement\": [\n{\n\"Sid\": \"PublicRead\",\n\"Effect\": \"Allow\",\n\"Principal\": \"*\",\n\"Action\": \"s3:GetObject\",\n\"Resource\": \"arn:aws:s3:::mybucketname/*\"\n}\n]\n}\n``` \nYou can set block \"all public access\" to true to prevent public access to the bucket even if there is a bucket policy that allows it."}}
+{"id": "../pages/digitalGarden/cs/c/arrays.mdx#1", "metadata": {"Header 1": "Arrays", "path": "../pages/digitalGarden/cs/c/arrays.mdx", "id": "../pages/digitalGarden/cs/c/arrays.mdx#1", "page_content": "In C an array is a variable that can store multiple values of the same data type. When declaring it you must define how many values the array can hold. You can then access particular elements by using indexes which start at 0. In C array out of bounds, meaning the index is not in the range of $0-length-1$, can not be checked by the compiler and therefore does not throw an error. An out of bounds exception can cause the program to crash or unexpected behavior. Arrays are initialized with the default value of that type but you can also specify specific values when initializing. \n```c\n#include \n\nvoid printIntArray(int arr[], int length){\nprintf(\"%d\", arr[0]);\nfor(int i = 1; i < length; i++) {\nprintf(\", %d\", arr[i]);\n}\nprintf(\"\\n\");\n}\n\nint main()\n{\nint empty[1];\nint marks[5] = {19, 10, 8, 17, 9};\nint otherMarks[] = {1,2,3}; // length is inferred\nint moreMarks[5] = {[2]=10, [4]=40}; // all others are 0\n\nprintIntArray(empty, 1);\nprintIntArray(marks, 5);\nprintIntArray(otherMarks, 3);\nprintIntArray(moreMarks, 5);\n\nreturn 0;\n}\n``` \n```bash filename=\"Output\"\n-410826608\n19, 10, 8, 17, 9\n1, 2, 3\n0, 0, 10, 0, 40\n```"}}
+{"id": "../pages/digitalGarden/cs/c/arrays.mdx#2", "metadata": {"Header 1": "Arrays", "Header 2": "Multidimensional arrays", "path": "../pages/digitalGarden/cs/c/arrays.mdx", "id": "../pages/digitalGarden/cs/c/arrays.mdx#2", "page_content": "In C you can also create multidimensional arrays so which are arrays of arrays. Just as in other language 2D can be visualized as a table. You can also go further like 3D etc. but this can quickly get very confusing."}}
+{"id": "../pages/digitalGarden/cs/c/arrays.mdx#3", "metadata": {"Header 1": "Arrays", "Header 2": "Pointer arithmetic", "path": "../pages/digitalGarden/cs/c/arrays.mdx", "id": "../pages/digitalGarden/cs/c/arrays.mdx#3", "page_content": "Interestingly the name of an array is also a pointer to the first element of an array which we can make use of with a concept called pointer arithmetic to iterate through the array. \n```c\n#include \n\nvoid printIntArray(int* arr, int length) {\nint *arr_end = arr + length;\nfor(int* ptr = arr; ptr < arr_end; ptr++){\nprintf(\"%p\\t%d\\n\", (void*)ptr, *ptr);\n}\n}\n\nint main()\n{\nint marks[] = {19, 10, 8, 17, 9};\nint* ptr = marks; // same as &marks[0]\nprintIntArray(ptr, 5);\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#1", "metadata": {"Header 1": "Control Flow", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#1", "page_content": "These work just as in many other languages so will not go into further detail."}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#2", "metadata": {"Header 1": "Control Flow", "Header 3": "If/Else", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#2", "page_content": "```c\nif (test expression1) {\n// statement(s)\n}\nelse if (test expression2) {\n// statement(s)\n}\nelse {\n// statement(s)\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#3", "metadata": {"Header 1": "Control Flow", "Header 3": "For", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#3", "page_content": "```c\nfor (initializationStatement; testExpression; updateStatement) {\n// statements inside the body of loop\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#4", "metadata": {"Header 1": "Control Flow", "Header 3": "While", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#4", "page_content": "```c\nwhile (testExpression) {\n// the body of the loop\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#5", "metadata": {"Header 1": "Control Flow", "Header 3": "Do While", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#5", "page_content": "```c\ndo {\n// the body of the loop\n}\nwhile (testExpression);\n```"}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#6", "metadata": {"Header 1": "Control Flow", "Header 3": "Break and Continue", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#6", "page_content": "The `break` statement ends a loop immediately when it is encountered. The `continue` statement skips the current iteration of a loop and continues with the next iteration."}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#7", "metadata": {"Header 1": "Control Flow", "Header 3": "Goto", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#7", "page_content": "Just dont use this.... if you need it you are doing something wrong unless you have a very very special use-case. \n```c\n#include \nint main(void)\n{\nint num, i = 1;\nprintf(\"Enter the number whose table you want to print?\");\nscanf(\"%d\", &num);\ntable:\nprintf(\"%d x %d = %d\\n\", num, i, num * i);\ni++;\nif (i <= 10)\ngoto table;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/controlFlow.mdx#8", "metadata": {"Header 1": "Control Flow", "Header 3": "Switch", "path": "../pages/digitalGarden/cs/c/controlFlow.mdx", "id": "../pages/digitalGarden/cs/c/controlFlow.mdx#8", "page_content": "Important to note here is that the values of expression and each constant-expression must have an integral type and a constant-expression must have an unambiguous constant integral value at compile time. Also the break here makes sure that it doesn't fall through to the other statements. \n```c\nswitch (expression) {\ncase constant-expression-1:\n// statements\nbreak;\n\ncase constant-expression-t2:\n// statements\nbreak;\ndefault:\n// default statements\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#1", "metadata": {"Header 1": "Functions", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#1", "page_content": "Just as with variables functions need to be defined before they can be used. To declare a function we use function prototypes, which include a name, the type of value it return and a list of parameters it takes. Parameters being values that the function it takes as input, also with a name and type just like variables. \nParameters are pass by value in C, meaning that a copy of the input is made on the stack which is local to the function body. Later on you can also pass by reference using pointers. \n```c\n#include \nint addNumbers(int a, int b); // function prototype\n\nint main()\n{\nint n1 = 1, n2 = 2, sum;\n\nsum = addNumbers(n1, n2);\nprintf(\"sum = %d\",sum);\n\nreturn 0;\n}\n\nint addNumbers(int a, int b) {\nreturn a + b;\n}\n\n```"}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#2", "metadata": {"Header 1": "Functions", "Header 2": "Pass by reference", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#2", "page_content": "You might find yourself often swapping values between two variables which would lead you to implementing a swap function and your first attempt might look something like this \n```c\n#include \nvoid swap(int a, int b) {\nint temp = a;\na = b;\nb = temp;\nprintf(\"swap: a=%d, b=%d\\n\", a, b);\n}\nint main()\n{\nint a = 10;\nint b = 5;\n\nswap(a, b);\nprintf(\"main: a=%d, b=%d\\n\", a, b);\n\nreturn 0;\n}\n``` \nWhen executing the above code you will notice that the desired result was not reached due to functions in java being pass by value. To fix this we can use pointers and create functions which are pass by reference. \n```c\n#include \nvoid swap(int* a, int* b) {\nint temp = *a;\n*a = *b;\n*b = temp;\nprintf(\"swap: a=%d, b=%d\\n\", a, b);\n}\nint main()\n{\nint a = 10;\nint b = 5;\n\nswap(&a, &b);\nprintf(\"main: a=%d, b=%d\\n\", a, b);\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#3", "metadata": {"Header 1": "Functions", "Header 2": "Multiple return values", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#3", "page_content": "By using pointers as so called output parameters you can have functions return more then one value."}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#4", "metadata": {"Header 1": "Functions", "Header 2": "Pointers to functions", "Header 3": "Map", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#4", "page_content": "We can use pointer for functions for a multitude of things for example passing a function to a map function which applies the function to every element in the array. \n```c\n#include \n#include \n\n#define LENGTH 5\n\nvoid map(int a[], int len, int (*f)(int))\n{\nfor (int i = 0; i < len; i++)\n{\na[i] = f(a[i]);\n}\n}\n\nint inc(int i)\n{\nreturn i + 1;\n}\n\nint main()\n{\nint i;\nint values[LENGTH] = {88, 56, 100, 2, 25};\n\nprintf(\"Before: \");\nfor (i = 0; i < LENGTH; i++)\n{\nprintf(\"%d \", values[i]);\n}\n\nmap(values, LENGTH, inc);\n\nprintf(\"\\nAfter: \");\nfor (i = 0; i < LENGTH; i++)\n{\nprintf(\"%d \", values[i]);\n}\n\nreturn (0);\n}\n``` \n```bash filename=\"output\"\nBefore: 88 56 100 2 25\nAfter: 89 57 101 3 26\n```"}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#5", "metadata": {"Header 1": "Functions", "Header 2": "Pointers to functions", "Header 3": "QSort", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#5", "page_content": "Another common use case is when you want to use the `qsort` function from the standard library to sort an array. \n```c void qsort(void *base, size_t nitems, size_t size, int (*compar)(const void *, const void*))``` \n```c\n#include \n#include \n\n#define LENGTH 5\n\nint compareInts(const void *a, const void *b)\n{\nreturn (*(int *)a - *(int *)b);\n}\n\nint main(void)\n{\nint i;\nint values[LENGTH] = {88, 56, 100, 2, 25};\n\nprintf(\"Before: \");\nfor (i = 0; i < LENGTH; i++)\n{\nprintf(\"%d \", values[i]);\n}\n\nqsort(values, LENGTH, sizeof(int), compareInts);\n\nprintf(\"\\nAfter: \");\nfor (i = 0; i < LENGTH; i++)\n{\nprintf(\"%d \", values[i]);\n}\n\nreturn (0);\n}\n``` \n```bash filename=\"output\"\nBefore: 88 56 100 2 25\nAfter: 2 25 56 88 100\n```"}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#6", "metadata": {"Header 1": "Functions", "Header 2": "Macros", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#6", "page_content": "Macros are based on the define preprocessor directive, and work very similarly to functions. Just as when defining a key value pair you are limited to one line unless you use a backslash, \"\\\", at the end. Macros can be faster then normal functions because in the end they are just text substitutions and you therefore don't have the overhead when using functions like creating a new memory space. You must however be careful when using macros as they can not be debugged and because they really are just text substitute they can cause unexpected side effects. \n```c\n#include \n#define PRINT(a) \\\nprintf(\"value=%d\\n\", a);\n\n#define MAX(a, b) ((a) > (b)) ? (a) : (b)\n\nint main(void)\n{\nint a = 5;\nint b = 4;\n\nPRINT(a); // 5\nint c = MAX(++a, b); // becomes ((++a) > (b)) ? (++a) : (b)\nPRINT(c); // 7\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/functions.mdx#7", "metadata": {"Header 1": "Functions", "Header 2": "Macros", "Header 3": "Macro operators", "path": "../pages/digitalGarden/cs/c/functions.mdx", "id": "../pages/digitalGarden/cs/c/functions.mdx#7", "page_content": "We have already seen the first one in action `\\`. Another one is `defined` which can be used to check if a symbol is already defined `#if defined(DEBUG)` which is very similar to `#ifdef`. \n#### The # operator \nIf you place a # in front of a parameter in a macro definition is inserts double quotes around the actual marco argument and therefore makes it to a constant string. Strings that are separated by a white space are concatenated during preprocessing so you can do something like this \n```c\n#include \n#define PRINTINT(var) printf(#var \"=%d\\n\", var)\n\nint main(void)\n{\nint count = 100;\nPRINTINT(count); // printf(\"count\" \"=%d\\n\", count); -> printf(\"count=%d\\n\", count);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/introduction.mdx#1", "metadata": {"Header 1": "Introduction to C", "Header 2": "History", "path": "../pages/digitalGarden/cs/c/introduction.mdx", "id": "../pages/digitalGarden/cs/c/introduction.mdx#1", "page_content": "The C programming language was originally developed at Bell Labs (like so many other things) by Dennis Ritchie in 1972 to construct utilities running on Unix. Later on, it was also used to re-implement the entire kernel of the Unix operating system and has till this day been the kernel language for Unix. In 1989 C was standardized by ANSI (American National Standards Institute) to so-called ANSI C, also known as C89. Later on, in the same year, it was then adopted by the International Organization for Standardization (ISO) to create the so-called Standard C. Over the years ISO has published new standards corresponding to the years in which they were published: C90, C99, C11, C17 and now they are working on C2x."}}
+{"id": "../pages/digitalGarden/cs/c/introduction.mdx#2", "metadata": {"Header 1": "Introduction to C", "Header 2": "Running a C Program", "path": "../pages/digitalGarden/cs/c/introduction.mdx", "id": "../pages/digitalGarden/cs/c/introduction.mdx#2", "page_content": "When writing a C program there are 4 phases of programming: \n- Editing: Writing and modifying your code.\n- Compiling: This part is split into 2 phases:\n- Preprocessing: The code can still be modified by the compiler.\n- Compilation: The compiler checks the syntax and semantics of your code and makes sure everything is in order. The compiler then translates the code to assembly language which is then further translated into actual machine code/instructions. These machine instructions are then stored in object files that have either the extension `.obj` or `.o`.\n- Linking: The goal of this phase is to get the program into its final form for execution. The linker combines the object modules with additional libraries needed by the program to create the entire executable file which can then be run.\n- Running: This is the final phase and is self-explanatory. \nFor more about the compilation and linking phase check out this [article](https://medium.com/@bdov_/what-happens-when-you-type-gcc-main-c-a4454564e96d)."}}
+{"id": "../pages/digitalGarden/cs/c/introduction.mdx#3", "metadata": {"Header 1": "Introduction to C", "Header 2": "Running a C Program", "Header 3": "First C Program", "path": "../pages/digitalGarden/cs/c/introduction.mdx", "id": "../pages/digitalGarden/cs/c/introduction.mdx#3", "page_content": "```c filename=\"helloWorld.c\"\n#include \"stdio.h\"\n\nint main(void)\n{\nprintf(\"Hello World\"); // prints \"Hello World\"\nreturn 0;\n}\n``` \nTo then compile and run our \"Hello World\" we can use for example the GNU Compiler Collection (gcc). \n```bash\nfoo@bar:~$ gcc -std=c11 -pedantic -pedantic-errors -Wall -Wextra -g -o helloWorld helloWorld.c\nfoo@bar:~$ helloWorld.exe\nHello World\n``` \nThe options mean the following: \n- `-std=c11` use C11 standard or can you use `-std=c89`, `-std=c99`, `-ansi`.\n- `-pedantic` use strict ISO C warnings.\n- `-pedantic-errors` use strict ISO C errors.\n- `-Wall` shows all warnings.\n- `-Wextra` turn on extra warnings.\n- `-g` activates the debugger.\n- `-o` the name of the executable file."}}
+{"id": "../pages/digitalGarden/cs/c/introduction.mdx#4", "metadata": {"Header 1": "Introduction to C", "Header 2": "Commenting", "path": "../pages/digitalGarden/cs/c/introduction.mdx", "id": "../pages/digitalGarden/cs/c/introduction.mdx#4", "page_content": "Comments are the same as in many other programming languages like Java (inspired by C/C++). A single-line comment starts with `//` and is written over a line. A multi-line comment is written between `/* ... */` and can span over multiple lines."}}
+{"id": "../pages/digitalGarden/cs/c/introduction.mdx#5", "metadata": {"Header 1": "Introduction to C", "Header 2": "Include", "path": "../pages/digitalGarden/cs/c/introduction.mdx", "id": "../pages/digitalGarden/cs/c/introduction.mdx#5", "page_content": "The `#include ` is a preprocessor directive that tells the compiler that we want something done in the preprocessing phase. All preprocessor directives start with a `#`. The \"include\" instruction tells the compiler to include the contents of the \"stdio.h\" file. You might notice it has the `.h` extension which means it is a header file. Header files define information about functions, so-called function prototypes which describe a function so the functions name, its arguments etc. The file we are including stands for standard input output which is part of the C standard library and we use the `printf` function of that file to write to the standard output, which is by default the console. \nWhen specifying the file to be included you can either write it between double quotes or angle brackets. The difference between these two forms is subtle but important. If a header file is included using < >, the preprocessor will search a predetermined directory path to locate the header file (the folder for the standard library). If the header file is enclosed in \"\", the preprocessor will look first for the header file in the same directory as the source file and then in the other folders."}}
+{"id": "../pages/digitalGarden/cs/c/pointers.mdx#1", "metadata": {"Header 1": "Pointers", "path": "../pages/digitalGarden/cs/c/pointers.mdx", "id": "../pages/digitalGarden/cs/c/pointers.mdx#1", "page_content": "Every variable is a memory location and every memory location has an address which can be accessed using the address-of operator, `&`. A pointer is a variable whose value is the address of another variable. This address is internally represented as an unsigned int on most systems however you shouldn't think of it as such (you can output it in hex using the %p format specifier). Every pointer has the type of the variable it is pointing to so the compiler knows how much memory is occupied by that variable. The `*` denotes a pointer. You can define and initialize a pointer by pointing to no location in memory with `NULL`, a so-called null pointer, which is also equivalent to 0. \n \nTo access the value the variable holds which a pointer is pointing we can dereference the pointer by using `*` again. \nPointers are also stored in memory so they also have addresses so it is possible to output them as well. &pnumber warning by compiler because expected a pointer but it is a pointer to a pointer of itn so cast to void\\*??? \n\nYou should never dereference an uninitialized pointer as if you assign it a value it could go anywhere. You could maybe overwrite data or even cause the program to crash!\n \n```c\n#include \nint main()\n{\nint var = 5;\nint* p_var = &var;\nprintf(\"var=%d and it's address is %p\\n\", var, (void*)&var);\nprintf(\"p_var=%p and it's address is %p and the value it points to is %d\"\n, (void*)p_var, (void*)&p_var, *p_var);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/pointers.mdx#2", "metadata": {"Header 1": "Pointers", "Header 2": "Void pointers", "path": "../pages/digitalGarden/cs/c/pointers.mdx", "id": "../pages/digitalGarden/cs/c/pointers.mdx#2", "page_content": "A void pointer can store an address of any type and can not be dereferenced as it doesn't know the size of the type it is pointing to so you must first cast it to another pointer type if you want to do so. \n```c\n#include\nint main()\n{\nint a = 10;\nvoid *ptr = &a;\nprintf(\"Address of a is %p\\n\", ptr);\nprintf(\"Value of a is %d\", *((int*)ptr));\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/pointers.mdx#3", "metadata": {"Header 1": "Pointers", "Header 2": "Const pointers", "path": "../pages/digitalGarden/cs/c/pointers.mdx", "id": "../pages/digitalGarden/cs/c/pointers.mdx#3", "page_content": "There are 3 ways you can use the `const` keyword with pointers all having different results. \nWhen `const` is written before the type it defines a pointer to a constant value meaning we cant change the value via dereferencing. If the variable the pointer points isn't defined as a constant then the value can still be changed. \n```c\n#include\nint main()\n{\nint val = 10;\nconst int* pointer = &val;\n// *pointer = 3; this does not work\nval = 4; // this however still does\npointer = &val;\nprintf(\"%d\", *pointer); // 4\n}\n``` \nWhen `const` is written in between the type and identifier you can not change the address the pointer points to, however you can still the change the value of the variable as this has no effect on the address. \n```c\n#include\nint main()\n{\nint val = 10;\nint* const pointer = &val;\nint otherVal = 3;\n// *pointer = &otherVal; this does not work\n*pointer = 5; // this however still works\nprintf(\"%d\", *pointer); // 4\n}\n``` \nYou can then also combine these two concepts."}}
+{"id": "../pages/digitalGarden/cs/c/pointers.mdx#4", "metadata": {"Header 1": "Pointers", "Header 2": "Double pointers (pointers to pointers)", "path": "../pages/digitalGarden/cs/c/pointers.mdx", "id": "../pages/digitalGarden/cs/c/pointers.mdx#4", "page_content": "You can in theory also go further then double but its just becomes a mess and shouldn't be done. \nA common use case for double pointers is if you want to preserve the Memory-Allocation or Assignment even outside of a function call. \n```c\n#include \n#include \n\nvoid foo(int **p)\n{\nint a = 5;\n*ptr = &a;\n}\n\nint main(void)\n{\nint *p = NULL;\np = malloc(sizeof(int));\n*p = 42;\nfoo(&p);\nprintf(\"%d\\n\", *p); // 5 not 42\nfree(p);\np = NULL;\n\nreturn 0;\n}\n``` \nanother common use case is when working with strings. \n```c\nint wordsInSentence(char **s) {\nint w = 0;\nwhile (*s) {\nw++;s++;\n}\nreturn w;\n}\n\nint main(void)\n{\nchar *word = \"foo\";\nchar **sentence;\n\nsentence = malloc(4 * sizeof *sentence); // assume it worked\nsentence[0] = word;\nsentence[1] = word;\nsentence[2] = word;\nsentence[3] = NULL;\n\nprintf(\"total words in my sentence: %d\\n\", wordsInSentence(sentence));\n\nfree(sentence);\nfree(word);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#1", "metadata": {"Header 1": "Standard File I/O", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#1", "page_content": "To interact with files in C you need to have a FILE pointer a so called stream, which will let the program keep track of the memory address of the file being accessed. In C text files are sequence of characters as lines each ending with a newline (\\n). Interestingly you have already been working with file I/O since the begging as C automatically opens 3 files, the standard input (keyboard), standard output and error (both being the display). You have read from and written to these files using `scanf()` and `printf()`."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#2", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Opening and closing files", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#2", "page_content": "The FILE pointer points to the FILE struct which represents a stream. To be able to open a file and get a FILE pointer you need to use the `FILE *fopen(const char *restrict pathname, const char *restrict mode)` function which takes the name of the file and the mode to open it with, depending on the mode certain operations are limited. The function returns a FILE pointer or if something went wrong NULL. Once you have finished working with the file or you have reached the end of the file marked with EOF (end of file, equivalent to -1) you should close it with the `int fclose(FILE *stream)` function."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#3", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Opening and closing files", "Header 3": "Modes", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#3", "page_content": "- r - Open for reading. If the file does not exist returns NULL.\n- w - Open for writing. If the file exists, its contents are overwritten. If the file does not exist, it will be created.\n- a - Open for append. Data is added to the end of the file. If the file does not exist, it will be created.\n- r+ - Open for both reading and writing. If the file does not exist, returns NULL.\n- w+ - Open for both reading and writing. If the file exists, its contents are overwritten. If the file does not exist, it will be created.\n- a+ - Open for both reading and appending. If the file does not exist, it will be created."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#4", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Buffers", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#4", "page_content": "Characters that are written to a stream are normally accumulated and sent to the file in a block. Similarly, streams often retrieve input from the host environment in blocks rather per character. This is called buffering. \nThere are three different kinds of buffering strategies: \n- Characters written to or read from an **unbuffered stream** are transmitted individually to or from the file as soon as possible.\n- Characters written to a **line buffered stream** are transmitted to the file in blocks when a newline character is encountered.\n- Characters written to or read from a **fully buffered stream** are transmitted to or from the file in blocks of arbitrary size. \nFlushing output on a buffered stream means transmitting all accumulated characters to the file. Streams can automatically flush when following happens: \n- When you try to do output and the output buffer is full.\n- When the stream is closed.\n- When the program terminates by calling exit.\n- When a newline is written, if the stream is line buffered. \nIf you want to explicitly flush the buffered output you can use `int fflush (FILE *stream)`. \nYou can bypass the stream buffering facilities altogether by using the POSIX input and output functions that operate on file descriptors instead."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#5", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Reading", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#5", "page_content": "To read from a text file you have 3 options: \n- `int fscanf(FILE *restrict stream, const char *restrict format, ...)`, the file alternative of scanf.\n- `int fgetc(FILE *stream)`, reads a single char (returns its int value) and increments the header by one.\n- `char *fgets(char *restrict s, int n, FILE *restrict stream)`, reads an entire line as a string and keeps the newline at the end. \n```c\n#include \n\nint main(void){\nchar *fileName = \"./text.txt\";\nFILE *file = fopen(fileName, \"r\");\nif(file == NULL){\nprintf(\"Failed to open %s file!\\n\", fileName);\nreturn -1;\n}\n\nchar c = '\\0';\nwhile(c != EOF){\nc = fgetc(file);\nprintf(\"%c\", c);\n}\nprintf(\"\\n\");\nrewind(file); // reset position to start\n\nchar str [100];\nwhile(fgets(str, 100, file)){\nprintf(\"%s\\n\", str);\n}\n\nrewind(file); // reset position to start\n\nint bananaCount = 0;\nfscanf(file, \"%s %s %d %s\", str, str, &bananaCount, str);\nprintf(\"bananaCount=%d\\n\", bananaCount);\n\nfclose(file);\nreturn 0;\n}\n``` \n```text filename=\"text.txt\"\nI have 3 bananas\n```"}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#6", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Writing", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#6", "page_content": "Very similar to reading you have the following functions for writing \n- `int fprintf(FILE *restrict stream, const char *restrict format, ...)`\n- `int fputc(int c, FILE *stream)`\n- `int fputs(const char *restrict s, FILE *restrict stream)` \nImportant to note here is that the null terminator, '\\0' will not be written."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#7", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Positioning", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#7", "page_content": "The file position of a stream describes where in the file the stream is currently reading or writing. \n- `long int ftell (FILE *stream)` returns the current file position of the stream.\n- `int fseek (FILE *stream, long int offset, int whence)` is used to change the file position of the stream. The value of whence must be one of the constants SEEK_SET, SEEK_CUR, or SEEK_END, to indicate whether the offset is relative to the beginning of the file, the current file position, or the end of the file."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#8", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Rename and move", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#8", "page_content": "You can rename or move files with `int rename(const char *old_filename, const char *new_filename)`. The function moves the file in between directories if needed so it can also be used to move files."}}
+{"id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#9", "metadata": {"Header 1": "Standard File I/O", "Header 2": "Remove", "path": "../pages/digitalGarden/cs/c/standardFileIO.mdx", "id": "../pages/digitalGarden/cs/c/standardFileIO.mdx#9", "page_content": "To remove/delete a file you can use `int remove(const char *pathname)`. if the file really is a file then `int unlink(const char *pathname)` will be called and if it is a directory `int rmdir(const char *pathname)` is called."}}
+{"id": "../pages/digitalGarden/cs/c/strings.mdx#1", "metadata": {"Header 1": "Strings", "path": "../pages/digitalGarden/cs/c/strings.mdx", "id": "../pages/digitalGarden/cs/c/strings.mdx#1", "page_content": "String are stored and can be handled as arrays of chars which is why you often hear character array instead of string. In C the compiler adds at the end of each string literal the null character, '\\0' (not to be confused with NULL) so it knows where the string ends. This also means that the length of a string is always one longer then you might think it is. To get the length of a string you can implement your own function or use the built in function `strlen` provided in `string.h`. \n```c\n#include \n#include \n\nsize_t getStringLength(char *str)\n{\nsize_t count;\nfor (count = 0; str[count] != '\\0'; ++count)\n;\nreturn count;\n}\n\nint main()\n{\nchar a[6] = {'h', 'e', 'l', 'l', 'o', '\\0'};\nchar b[] = {'h', 'e', 'l', 'l', 'o', '\\0'};\nchar c[] = \"hello\"; // string literal\nchar d[] = {\"hello\"};\nchar e[50] = \"hello\"; // to long\nchar f[5] = \"hello\"; // to short, '\\0' is not added so carefull...\n\nprintf(\"%s length=%ld strlen=%ld\\n\", a, getStringLength(a), strlen(a));\nprintf(\"%s length=%ld strlen=%ld\\n\", b, getStringLength(b), strlen(b));\nprintf(\"%s length=%ld strlen=%ld\\n\", c, getStringLength(c), strlen(c));\nprintf(\"%s length=%ld strlen=%ld\\n\", d, getStringLength(d), strlen(d));\nprintf(\"%s length=%ld strlen=%ld\\n\", e, getStringLength(e), strlen(e));\nprintf(\"%s length=%ld strlen=%ld\\n\", f, getStringLength(f), strlen(f));\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nhello length=5 strlen=5\nhello length=5 strlen=5\nhello length=5 strlen=5\nhello length=5 strlen=5\nhello length=5 strlen=5\nhellohello length=10 strlen=10\n```"}}
+{"id": "../pages/digitalGarden/cs/c/strings.mdx#2", "metadata": {"Header 1": "Strings", "Header 2": "Wide characters", "path": "../pages/digitalGarden/cs/c/strings.mdx", "id": "../pages/digitalGarden/cs/c/strings.mdx#2", "page_content": "A wide character `wchar_t` is similar to char data type, except that it takes up twice the space and can take on much larger values as a result. A char can take 256 values which corresponds to entries in the ascii table. On the other hand, wide char can take on 65536 values which corresponds to unicode values. So whenever you see a function that has to do with strings or characters and there is a w then it most lightly has to do with wide characters. You can create wide string litrals just like normal string literal and then by adding the L prefix. \n```c\n#include \n\nint main()\n{\nwchar_t w = L'A';\nwchar_t *p = L\"Hello!\";\nsize_t length = wcslen(p); // can't use strlen\nprintf(\"Wide character: %c\\n\", w);\nprintf(\"Wide string with length %ld: %S\\n\", length, p);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/strings.mdx#3", "metadata": {"Header 1": "Strings", "Header 2": "String functions", "Header 3": "Converting", "path": "../pages/digitalGarden/cs/c/strings.mdx", "id": "../pages/digitalGarden/cs/c/strings.mdx#3", "page_content": "```c\n#include \n#include \n\nint main()\n{\nchar number[] = \"123.321 hello\";\nchar *restL;\nchar *restF;\n\nint i = atoi(number); // returns 0 if fails\nlong l = atol(number);\ndouble f = atof(number);\n\nstrtol(number, &restL, 10); // third parameter is base\nstrtod(number, &restF);\n\nprintf(\"%s = %d\\n\", number, i);\nprintf(\"%s = %ld\\n\", number, l);\nprintf(\"%s = %f\\n\", number, f);\nprintf(\"%s\\n\", restL);\nprintf(\"%s\\n\", restF);\nreturn 0;\n}\n``` \n```bash filename=\"output\"\n123.321 hello = 123\n123.321 hello = 123\n123.321 hello = 123.321000\n.321 hello\nhello\n```"}}
+{"id": "../pages/digitalGarden/cs/c/strings.mdx#4", "metadata": {"Header 1": "Strings", "Header 2": "String functions", "Header 3": "Comparing", "path": "../pages/digitalGarden/cs/c/strings.mdx", "id": "../pages/digitalGarden/cs/c/strings.mdx#4", "page_content": "```c\n#include \n#include \n\nint main(void)\n{\nchar str1[] = \"Hello Earth\";\nchar str2[] = \"Hello World\";\n\nprintf(\"Compare %s with %s = %d\\n\", str1, str2, strcmp(str1, str2));\nprintf(\"Compare first 5 letters of %s with %s = %d\\n\", str1, str2, strncmp(str1, str2, 5));\n\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nCompare Hello Earth with Hello World = -18\nCompare first 5 letters of Hello Earth with Hello World = 0\n```"}}
+{"id": "../pages/digitalGarden/cs/c/strings.mdx#5", "metadata": {"Header 1": "Strings", "Header 2": "String functions", "Header 3": "Analyzing", "path": "../pages/digitalGarden/cs/c/strings.mdx", "id": "../pages/digitalGarden/cs/c/strings.mdx#5", "page_content": "```c\n#include \n#include \n\nint main()\n{\nchar c = 'c';\n\nprintf(\"isLower: %d\\n\", islower(c));\nprintf(\"isUpper: %d\\n\", isupper(c));\nprintf(\"isAlpha: %d\\n\", isalpha(c)); // a-Z or A-Z adds 2 if lower, 1 if upper\nprintf(\"isDigit: %d\\n\", isdigit(c)); // 0-9\nprintf(\"isAlphanumeric: %d\\n\", isalnum(c)); // a-Z or A-Z or 0-9 adds 2 if lower, 1 if upper\nprintf(\"isWhitespace: %d\\n\", isspace(c));\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nisLower: 1\nisUpper: 0\nisAlpha: 2\nisDigit: 0\nisAlphanumeric: 2\nisWhitespace: 0\n```"}}
+{"id": "../pages/digitalGarden/cs/c/structures.mdx#1", "metadata": {"Header 1": "Structures", "path": "../pages/digitalGarden/cs/c/structures.mdx", "id": "../pages/digitalGarden/cs/c/structures.mdx#1", "page_content": "In C structures defined using the `struct` keyword are a very important concept as they allow for grouping of elements very similarly to classes in other languages they just don't include functions. For example a date, month, day, year. can then create variables as type struct date. memory is allocated 3 variables inside. can access member variables with. so `today.year` for example can also assign initialcompound literal can assign values after initilation like (struct date) `{1,2,3}` or specify the specific values with .month=9for only one time thing. can initialize structs like arrays with `{7,2,2015}`. or just the frist 2 or can do `{.month=12}`"}}
+{"id": "../pages/digitalGarden/cs/c/structures.mdx#2", "metadata": {"Header 1": "Structures", "Header 2": "Unnamed structs", "path": "../pages/digitalGarden/cs/c/structures.mdx", "id": "../pages/digitalGarden/cs/c/structures.mdx#2", "page_content": "Unnamed structures can be used if you know that you only need one instance of it at all times which can be useful for constants. \n```c\nstruct /* No name */ {\nfloat x;\nfloat y;\n} point;\n\npoint.x = 42;\n```"}}
+{"id": "../pages/digitalGarden/cs/c/structures.mdx#3", "metadata": {"Header 1": "Structures", "Header 2": "Array of structs", "path": "../pages/digitalGarden/cs/c/structures.mdx", "id": "../pages/digitalGarden/cs/c/structures.mdx#3", "page_content": "Can then do all the normal things you would expect to be able to do with an array. \n```c\nstruct Student\n{\nint rollNumber;\nchar studentName[10];\nfloat percentage;\n};\nstruct Student studentRecord[5];\n```"}}
+{"id": "../pages/digitalGarden/cs/c/structures.mdx#4", "metadata": {"Header 1": "Structures", "Header 2": "Nested structs", "path": "../pages/digitalGarden/cs/c/structures.mdx", "id": "../pages/digitalGarden/cs/c/structures.mdx#4", "page_content": "A nested structure in C is a structure within structure. One structure can be declared inside another structure in the same way structure members are declared inside a structure. \n```c\nstruct Date\n{\nint day;\nint month;\nint year;\n};\nstruct Time\n{\nint hours;\nint minutes;\nint seconds;\n};\nstruct DateTime\n{\nstruct Date date;\nstruct Time time;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/structures.mdx#5", "metadata": {"Header 1": "Structures", "Header 2": "Pointers to structs", "path": "../pages/digitalGarden/cs/c/structures.mdx", "id": "../pages/digitalGarden/cs/c/structures.mdx#5", "page_content": "You can have pointers to struct variables. The important thing to know here is that there is a shorthand for accessing the data by usign the `-\\>` operator. \n```c\n#include\n\nstruct dog\n{\nchar name[10];\nchar breed[10];\nint age;\nchar color[10];\n};\n\nint main()\n{\nstruct dog my_dog = {\"tyke\", \"Bulldog\", 5, \"white\"};\nstruct dog *ptr_dog;\nptr_dog = &my_dog;\n\nprintf(\"Dog's name: %s\\n\", (*ptr_dog).name); // instead of having to do this\nprintf(\"Dog's breed: %s\\n\", ptr_dog->breed); // you can do this\nprintf(\"Dog's age: %d\\n\", ptr_dog->age);\nprintf(\"Dog's color: %s\\n\", ptr_dog->color);\n\n// changing the name of dog from tyke to jack\nstrcpy(ptr_dog->name, \"jack\");\n\n// increasing age of dog by 1 year\nptr_dog->age++;\n\nprintf(\"Dog's new name is: %s\\n\", ptr_dog->name);\nprintf(\"Dog's age is: %d\\n\", ptr_dog->age);\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/structures.mdx#6", "metadata": {"Header 1": "Structures", "Header 2": "typdef", "path": "../pages/digitalGarden/cs/c/structures.mdx", "id": "../pages/digitalGarden/cs/c/structures.mdx#6", "page_content": "The `typedef` keyword is used in C to assign alternative names to existing datatypes. This can be especially powerfull when combined with structs.can be used to give a type a new name. so typedef unsigned char BYTE; BYTE can then be used as an allias. this can become very powerful with structs. \n```c\n#include \n\ntypedef struct Point\n{\ndouble x;\ndouble y;\n} Point; // can have the same name\n\nstruct date\n{\nunsigned short day;\nunsigned short month;\nunsigned int year;\n};\ntypedef struct date Date;\n\ntypedef unsigned char byte;\n\nint main(void)\n{\nPoint origin = {0, 0};\nstruct date today = {1, 4, 2022};\nDate tomorrow = {2, 4, 2022};\nbyte intSize = sizeof(int);\n\nprintf(\"The origin is: (%f/%f)\\n\", origin.x, origin.y);\nprintf(\"Today is %d/%d/%d\\n\", today.day, today.month, today.year);\nprintf(\"Tommorrow is %d/%d/%d\\n\", tomorrow.day, tomorrow.month, tomorrow.year);\nprintf(\"On my computer an int takes up %d bytes.\\n\", intSize);\n\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nThe origin is: (0.000000/0.000000)\nToday is 1/4/2022\nTommorrow is 2/4/2022\nOn my computer an int takes up 4 bytes.\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#1", "metadata": {"Header 1": "Variables and Data Types", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#1", "page_content": "To create a variable you first need a name. A variable name must start with a letter or underscore and then be followed by any combination of letters, underscores or digits as long as the name in the end isn't a reserved word like \"int\". Secondly you need a type which defines how the data is stored in memory for example `int` is an integer. When writing `int x;` you are declaring a variable which reserves the space in memory to hold the later on assigned value as it knows the amount of bytes the data type of the variable needs. Initializing a variable is giving it an initial value. This can be done as part of the declaration like for example `int a = 12;`."}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#2", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Basic Data Types", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#2", "page_content": "[A basic list of c data types](https://en.wikipedia.org/wiki/C_data_types#Main_types), how much memory they take up but this is very computer and compiler dependant. Closly tied to the amount of memory is of course the value range of a variable. These ranges can be checked by including [limits.h](https://pubs.opengroup.org/onlinepubs/007904975/basedefs/limits.h.html) for integer values and [float.h](https://pubs.opengroup.org/onlinepubs/007904975/basedefs/float.h.html) for float values (part of the standard library). \nSome interesting things to note are: \n- You can assign values using hex so `int x = 0xFFFFFF` is possible.\n- You can use scientific notation to assign values so `float x = 1.7e4` is possible.\n- You can add short, long, signed and unsigned to numerical values. `short` **might** make the types memory usage smaller, `long` **might** make it larger. `signed` is by default so has no real effect, `unsigned` means its range is only positive values and includes 0. Unless it is `int` itself the word int can be omitted so `long int` and `long` are the same. For some reason you can also do `long long` and `short short` who knows why?\n- If you want specific sized data types you can include from the standard library [stdint.h](https://pubs.opengroup.org/onlinepubs/009696899/basedefs/stdint.h.html) as to why to this doesn't exist for floats you can [read here](https://www.reddit.com/r/cpp/comments/34d7b6/why_do_we_have_intn_t_but_no_equivalent_for/)."}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#3", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Enums", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#3", "page_content": "An enum is a data type that only allows specific values. These values are under the hood mapped to integer constants with the first being mapped to 0 then the next is 0++ etc. if nothing else is specified. \n```c\n#include \n\nint main(void)\n{\nenum planets {mercury, venus, earth, mars, jupiter, saturn, uranus};\nenum planets home = earth;\nprintf(\"Our home is the %d. planet from the sun.\\n\", home+1);\n\nenum days {monday=1, tuesday, wednesday, thursday, friday, saturday=10, sunday};\nprintf(\"Level of motivation on a monday is %d and on a tuesday %d.\\n\", monday, tuesday);\nprintf(\"On saturday it is %d and sunday %d\\n.\", saturday, sunday);\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#4", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Boolean Types", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#4", "page_content": "To have variables with boolean values in C we can use the `_Bool` data type which can have the values 0 (false) or 1 (true). \n```c\n#include \n\nint main(void)\n{\n_Bool x = 1;\n_Bool y = 0;\nif(x) {\nprintf(\"This will print!\");\n} if (!y)\n{\nprintf(\"This will also print!\");\n}\n}\n``` \nAnother way would be to use an Enum with a typedef, this takes advantage of Enums being constant integer values under the hood. \n```c\n#include \ntypedef enum { FALSE, TRUE } Boolean;\n\nint main(void)\n{\nBoolean x = TRUE;\nBoolean y = 0;\nif (x) {\nprintf(\"This will print!\");\n}\nif (!y) {\nprintf(\"This will also print!\");\n}\n}\n``` \nFrom C99 onwards you can also `#include `. \n```c\n#include \n#include \n\nint main(void)\n{\nbool x = true;\nbool y = 0;\nif (x) {\nprintf(\"This will print!\");\n}\nif (!y) {\nprintf(\"This will also print!\");\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#5", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Format Specifiers", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#5", "page_content": "There are lots format specifiers for outputting different data types. You can also use format specifiers to do cool things like adding leading or trailing zeros or only showing a certain amount of decimal points. \n \n```c\n#include \n\nint main(void)\n{\nprintf(\"Characters: %c %c \\n\", 'a', 65);\nprintf(\"Preceding with blanks: %10d \\n\", 1977);\nprintf(\"Preceding with zeros: %010d \\n\", 1977);\nprintf(\"Some different radices: %d %x %o %#x %#o \\n\", 100, 100, 100, 100, 100);\nprintf(\"floats: %.2f %+.0e %E \\n\", 3.1416, 3.1416, 3.1416);\nprintf(\"Width trick: %*d \\n\", 20, 10);\nprintf(\"%s\", 0 ? \"true\" : \"false\");\nreturn 0;\n}\n``` \nYou can find more details in the [documentation of printf](https://www.cplusplus.com/reference/cstdio/printf/)."}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#6", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Visibility", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#6", "page_content": "All identifiers (variables, functions, classes etc.) must be defined before they can be used . Depending on where the identifier is defined they identifier has has a different visibility. Identifiers in the same block must be ambiguous and are visible in the inner blocks. An identifier from an outer block can be redefined in an inner block and can therefore be shadowed. \n```c\n#include \nint main(void)\n{\nint x = 6\n{\nint x = 9;\n{\nint x = 10;\nprintf(\"%d\", x) // 10\n}\nprintf(\"%d\", x) // 9\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#7", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Visibility", "Header 3": "Global", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#7", "page_content": "If you define a variable outside all blocks then it is part of the global scope and exists as long as the program runs and can be accessed between multiple files by including the header file where it is defined and adding the `extern` keyword before it. \n```c filename=\"main.c\"\n#include \n\nint x = 5; // global\n\nint main(void)\n{\nprintf(\"%d\", x) // 5\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#8", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Visibility", "Header 3": "Static", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#8", "page_content": "By adding the `static` keyword to the global variable we can limit it's visibility to just this file."}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#9", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Dynamically Allocated Memory", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#9", "page_content": "We have seen above that a lot of objects are only available inside their blocks once the block is finished they are removed. These objects are stored on the stack. If we create an object in a function we can return it and still work with it, however in C the object is copied on return which can very bad for performance if the object is very large. \nObjects can also be stored statically meaning they are available as long as the program runs. \n```c\n#include\nint inc()\n{\nstatic int count = 0;\ncount++;\nreturn count;\n}\n\nint main()\n{\nprintf(\"%d \", inc()); // 1\nprintf(\"%d \", inc()); // 2\nreturn 0;\n}\n``` \nThe last possibility is using dynamically allocated memory. In C you can not define an array with a certain size at runtime, if we would want to do something like that we would need dynamic memory allocation. To be able to use this in C you must include `stdlib.h`. To go back on our problem of returning a created object from a function we can create the object on the stack and then just return the address of the object."}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#10", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Dynamically Allocated Memory", "Header 3": "Malloc", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#10", "page_content": "The `malloc()` function, short for \"memory allocation\", is used to dynamically allocate a single large block of memory with the specified size and returns a pointer to the block. \n:::warning\nWhen the memory is no longer needed you should release it using the `free()` function as you are otherwise using unnecessary memory. If you call free multiple times you can cause unexpected behavior which is why you should also set the pointer to NULL.\n::: \n:::warning\nMalloc does not initialize the memory!\n::: \n```c\n#include \n#include \n\nint main()\n{\nint* ptr;\nint n, i;\n\nprintf(\"Enter number of elements:\");\nscanf(\"%d\",&n);\nprintf(\"Entered number of elements: %d\\n\", n);\n\n// Dynamically allocate memory using malloc()\nptr = (int*)malloc(n * sizeof(int)); // returns void*\n\nif (ptr) {\nprintf(\"Memory successfully allocated using malloc.\\n\");\n\nfor (i = 0; i < n; ++i) {\nptr[i] = i + 1;\n}\n\nprintf(\"The elements of the array are: \");\nfor (i = 0; i < n; ++i) {\nprintf(\"%d, \", ptr[i]);\n}\n}\n\nfree(ptr); // free the memory!!\nptr = NULL;\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#11", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Dynamically Allocated Memory", "Header 3": "Calloc", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#11", "page_content": "The `calloc()` function, short for \"contiguous allocation\", is very similiar to the malloc function however it dynamically allocates the specified number of blocks of memory of the specified type. The most important difference however is that it initializes each block with a default value of '0'. \n```c\n#include \n#include \n\nint main()\n{\nint* ptr;\nint n, i;\n\nprintf(\"Enter number of elements:\");\nscanf(\"%d\",&n);\nprintf(\"Entered number of elements: %d\\n\", n);\n\n// Dynamically allocate memory using calloc()\nptr = (int*)calloc(n, sizeof(int));\n\nif (ptr) {\nprintf(\"Memory successfully allocated using calloc.\\n\");\n\nfor (i = 0; i < n; ++i) {\nptr[i] = i + 1;\n}\n\nprintf(\"The elements of the array are: \");\nfor (i = 0; i < n; ++i) {\nprintf(\"%d, \", ptr[i]);\n}\n}\n\nfree(ptr); // free the memory!!\nptr = NULL;\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#12", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Constant Values", "Header 3": "Define", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#12", "page_content": "With `#define` you can define key value pairs that will be substituted in the preprocessing phase. Important is that you don't write a type, an equal or a semicolon! \n```c\n#include \n#define PI 3.14\n\nint main(void)\n{\nprintf(\"%f\",PI);\nreturn 0;\n}\n``` \nYou can also conditionally define variables depending on certain compiler arguments or environment variables. \n```c\n#include \n\n#define X 2\n\n#if X == 1\n#define Y 1\n#elif X==2\n#define Y 2\n#else\n#define Y 3\n#endif\n\nint main(void)\n{\nprintf(\"%d\",Y);\nreturn 0;\n}\n``` \nYou can also execute certain code by checking if something is defined or not. \n```c\n#include \n#define UNIX 1\n\nint main()\n{\n#ifdef UNIX\nprintf(\"UNIX specific function calls go here.\\n\");\n#endif\nprintf(\"C has some weird things.\\n\");\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#13", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Constant Values", "Header 3": "Const", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#13", "page_content": "In C90 the `const` keyword was added which does not allow the value of a variable to change, making it read-only. Using const is much more flexible then define as allows you to use a data type and it is also better for performance. \n```c\n#include \n\nint main(void)\n{\nconst int PI = 3.14;\nprintf(\"%f\",PI);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#14", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Operators", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#14", "page_content": "Has the same operators as in many other languages and also work the same so not gonna go into detail. The only interesting ones to go into are below. \n"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#15", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Operators", "Header 3": "Casting", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#15", "page_content": "conversion between different types can happen automatically (implicit) or has to be done explicit. \nfor example double to flaot is implicit as no data is lost however double to int data is lost so it ahs to be done explicit and the decimal points are truncated \n(int) 25.1 + (int) 27.435"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#16", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Operators", "Header 3": "Sizeof", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#16", "page_content": "The sizeof operator is very simple and just outputs how many bytes a data type or variable takes up. \n```c\nint x = 3;\n\nprintf(\"An int takes up %ld bytes on my computer and a double %ld\", sizeof(x), sizeof(double)); // 4 and 8\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#17", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Command-line Arguments", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#17", "page_content": "When compiling you can pass arguments to the main function. The first parameter `argc` is the argument count, the second parameter `argv` is the argument vector which is an array of strings. So in other words it is an array of character arrays or an array of character pointers. \n```c filename=\"main.c\"\n#include \n\nint main(int argc, char *argv[])\n{\nprintf(\"argc=%d\\n\", argc);\n\n// the first argument is the name of the executable\nprintf(\"exe name=%s\\n\", argv[0]);\n\nfor (int i = 1; i < argc; i++)\n{\nprintf(\"argv[%d]=%s\\n\", i, argv[i]);\n}\n\nreturn 0;\n}\n``` \nTo then pass arguments you can do the following \n```bash\nfoo@bar:~$ gcc -std=c11 -pedantic -pedantic-errors -Wall -Wextra -g -o argvExample main.c\nfoo@bar:~$ argvExample.exe arg1 arg2\nargc=3\nexe name=./argvExample\nargv[1]=arg1\nargv[2]=arg2\n```"}}
+{"id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#18", "metadata": {"Header 1": "Variables and Data Types", "Header 2": "Inputting Data", "path": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx", "id": "../pages/digitalGarden/cs/c/variablesDataTypes.mdx#18", "page_content": "The `stdio.h` file contains the `scanf()` function which reads input from the standard input stream \"stdin\", which by default is the console. The function can read and parse the input using the provided format specifier. Important to know is that it uses whitespaces to tokenize the input. \n```c filename=\"main.c\"\n#include \"stdio.h\"\n\nint main(void)\n{\nchar str[100];\nint i;\nprintf(\"Enter a word followed by a space and a number: \");\n// provide pointers to where to store the values (remember str is actually a pointer to the first element)\nint tokensRead = scanf(\"%s %d\", str, &i);\n\nprintf(\"%d tokens were read str: %s and i: %d\", tokensRead,str, i);\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/introduction.mdx#1", "metadata": {"Header 1": "Introduction to System Programming", "path": "../pages/digitalGarden/cs/c/systemProgramming/introduction.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/introduction.mdx#1", "page_content": "An operating system is a program that acts as an interface between the user and the computer hardware and controls the execution of all kinds of programs. Operating systems have kernels which are responsible for scheduling, starting and ending programs but also provide other functionalities like networking or file systems. \n \nCPUs run in two modes, [kernel and user modes](https://docs.microsoft.com/en-us/windows-hardware/drivers/gettingstarted/user-mode-and-kernel-mode) also well explained [here](https://blog.codinghorror.com/understanding-user-and-kernel-mode/). Only certain actions can be done in the kernel mode which is why there needs to be a way to interact between these two layers. This is what [system calls](https://www.ionos.com/digitalguide/server/know-how/what-are-system-calls/) are for. They allow a program to do things it can't do in the user mode like send information to the hardware etc. \n"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/introduction.mdx#2", "metadata": {"Header 1": "Introduction to System Programming", "Header 2": "POSIX", "path": "../pages/digitalGarden/cs/c/systemProgramming/introduction.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/introduction.mdx#2", "page_content": "POSIX stands for Portable Operating System Interface and is an API specification for system calls to maintain compatibility among operating systems. Therefore, any software that conforms to POSIX should be compatible with other operating systems that adhere to the POSIX standards. This is the reason, as to why most of the tools we use on Unix-like operating systems behave almost the same as it is pretty much POSIX compliant."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#1", "metadata": {"Header 1": "IPC with Pipes", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#1", "page_content": "Interprocess comminucation - IPC is the subject of coomunicatiing, echanging data and synchronizing between to processes. \nA kategorie of IPC are Datatransfers which use read and write system calls. The second category commincates via shared memory, without system calls and is therefore also faster. \nDatatransfers can be byte streams, message based or use special pseduoterminals. Important with all these mechanisms is that a read operation is destructive. Meaning if data has been read then it is no longer available to the others. Synchronization is done automaticallly. If there is no data available then read operation blocks. \nPipes, FIFOs, Stream sockets are unlimited byte streams which means that the number bytes does not matter. \nMessage queues and datagram sockets are messaged based. Each read operation reads one message exactly how it was written. It is not possible to only read a part of a message or multiple at once. \nShared Memory and memory mappings are fast but need to be synchronized. reading is however not destructive. Often semaphores are used."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#2", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "File locks", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#2", "page_content": "Work same as ReadWriteLock in Java.\ncoordiante file access. Read locks can be shared between multiples however if a process has a write lock then no other thread can have a read or write lock.\nflock and fcntl system calls???"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#3", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "Pipes", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#3", "page_content": "A pipe \"|\" is a form of redirection of standard output to some other destination that can be used in shells on Unix operating systems. So for example you can redirect the stdout of a command to a file or connect it to the stdin of another command. Pipes are unidirectional i.e data flows from left to right through the pipe. The command-line programs (right side) that do the further processing are referred to as filters. You can also use pipes programmatically in C for IPC. \nread blocks, if pipe is closed returns 0/EOF. \nJust a buffer in or file kernel memory with max capacity of 64KB. If a pipe is full write blocks until on the other end data is read. \npipe puts 2 file descriptors into the passed array the first (index 0) being the read end of the pipe and the other file descriptor for the write end. \nwhen finished with writing need to close it so read gets EOF and doesn't block indefinetly. If the pipe was closed on read side and the process still writes to the pipe the kernel sends a SIGPIPE signal. if the signal is ignored then write returns the error EPIPE. 
\n```c\n#include \n#include \n#include \n\nint main(void)\n{\nint fd[2]; // 0 = read, 1 = write end\npipe(fd);\n\nint id = fork();\n\nif (id == -1) { printf(\"Error when forking\"); return 1; }\nif (id == 0) // child process\n{\nclose(fd[0]); // close read end\nint x;\nprintf(\"CHILD: Input a number: \");\nscanf(\"%d\", &x);\nif (write(fd[1], &x, sizeof(x)) == -1) { printf(\"Error writing to pipe\"); return 3; }\nclose(fd[1]); // close write end when finished\n}\nelse // parent process\n{\nclose(fd[1]); // close write end\nint y;\nif (read(fd[0], &y, sizeof(y)) == -1) { printf(\"Error reading from pipe\"); return 3; }\nprintf(\"PARENT: You put in: %d\\n\", y);\nclose(fd[0]); // close read end when finished\n}\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nCHILD: Input a number: 4\nPARENT: You put in: 4\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#4", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "Pipes", "Header 3": "Bidirectional Pipes", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#4", "page_content": "There can be scenarios where you want bidirectional communication between two processes using pipes. This can't be done with just one pipe for this you need two pipes between the processes. \n```c\n#include \n#include \n#include \n\nint main(void)\n{\n/* Send a number from parent to child, child processes the number and sends back to parent. */\nconst int READ_END = 0; const int WRITE_END = 1;\nint childToParent[2];\nint parentToChild[2];\nif(pipe(childToParent) == -1) { printf(\"Error creating pipe\"); return 1; }\nif(pipe(parentToChild) == -1) { printf(\"Error creating pipe\"); return 1; }\n\nint id = fork();\nif (id == -1)\n{\nprintf(\"Error when forking\");\nreturn 2;\n}\nif (id == 0) // child process\n{\nclose(childToParent[READ_END]);\nclose(parentToChild[WRITE_END]);\nint input;\nif (read(parentToChild[READ_END], &input, sizeof(input)) == -1)\n{\nprintf(\"Error when reading from parent to child\");\nreturn 3;\n}\nprintf(\"CHILD: Received: %d\\n\", input);\nint output = input * 2; // input gets doubled\nif (write(childToParent[WRITE_END], &output, sizeof(output)) == -1)\n{\nprintf(\"Error when writing from child to parent\");\nreturn 4;\n}\nprintf(\"CHILD: Sent: %d\\n\", output);\n}\nelse // parent process\n{\nclose(parentToChild[READ_END]);\nclose(childToParent[WRITE_END]);\nprintf(\"PARENT: Input a number: \");\nint input;\nscanf(\"%d\", &input);\nif (write(parentToChild[WRITE_END], &input, sizeof(input)) == -1)\n{\nprintf(\"Error when writing from parent to child\");\nreturn 5;\n}\nprintf(\"PARENT: Sent: %d\\n\", input);\nint output;\nif (read(childToParent[READ_END], &output, sizeof(output)) == -1)\n{\nprintf(\"Error when 
reading from child to parent\");\nreturn 6;\n}\nprintf(\"PARENT: Received: %d\\n\", output);\nclose(parentToChild[WRITE_END]);\nclose(childToParent[READ_END]);\n}\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nPARENT: Input a number: 5\nPARENT: Sent: 5\nCHILD: Received: 5\nCHILD: Sent: 10\nPARENT: Received: 10\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#5", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "Pipes", "Header 3": "Simulating the \"|\" Pipe Operator", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#5", "page_content": "As you might have expected it is possible to simulate the pipe operator in the shell with pipes in C. \n```c\n#include \n#include \n#include \n#include \n\nint main(void)\n{\nconst int READ_END = 0;\nconst int WRITE_END = 1;\nint fd[2];\nif (pipe(fd) == -1)\n{\nprintf(\"Error creating pipe\");\nreturn 1;\n}\n\nint pingForkId = fork();\nif (pingForkId == -1)\n{\nprintf(\"Error when forking\");\nreturn 2;\n}\nif (pingForkId == 0) // child process\n{\ndup2(fd[WRITE_END], STDOUT_FILENO); // copies pipe write fd to stdout fd\nclose(fd[READ_END]);\nclose(fd[WRITE_END]); // can be closed as still a reference pointing to it\n\nexeclp(\"ping\", \"ping\", \"-c\", \"5\", \"google.com\", NULL); // current process code gets replaced by new process code. system() would create another child process.\n}\nint grepForkId = fork();\nif (grepForkId == -1)\n{\nprintf(\"Error when forking\");\nreturn 2;\n}\nif (grepForkId == 0) // child process\n{\ndup2(fd[READ_END], STDIN_FILENO); // copies pipe read fd to stdin fd\nclose(fd[READ_END]);\nclose(fd[WRITE_END]);\n\nexeclp(\"grep\", \"grep\", \"rtt\", NULL);\n}\n\nclose(fd[READ_END]);\nclose(fd[WRITE_END]);\n// Wait for children to terminate\nwaitpid(pingForkId, NULL, 0);\nwaitpid(grepForkId, NULL, 0);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#6", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "Pipes", "Header 3": "Synchronizing with Pipes", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#6", "page_content": "As mentioned above pipes don't need to transfer data, they can also just be used for synchronization between processes. \n```c\n#include \n#include \n#include \n\nint main(void)\n{\nint fd[2]; /* Process synchronization pipe */\n\nif (pipe(fd) == -1)\n{\nprintf(\"Error creating pipe\");\nreturn 1;\n}\n\nfor (int i = 0; i < 10; i++)\n{\nswitch (fork())\n{\ncase -1:\nprintf(\"Error when forking\");\nreturn 2;\ncase 0: // child process\nclose(fd[0]); // close read end\n\n// child does some work and lets parent know when finished\nfor (int j = 0; j < 100000000; j++)\n{\n}\nprintf(\"Finished work in Process %d\\n\", i);\n// notifies parent that done by decrementing file descriptor count\nclose(fd[1]);\nexit(EXIT_SUCCESS);\ndefault:\nbreak; // parent continue with loop\n}\n}\nprintf(\"Finished creating all children\\n\");\nclose(fd[1]); // parent doesn't use write end\nint dummy;\n// blocks till all are finished and receives EOF\nif (read(fd[0], &dummy, 1) != 0)\n{\n\nprintf(\"Parent didn't get EOF\");\nreturn 3;\n}\nprintf(\"All finished\");\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nFinished creating all children\nFinished work in Process 0\nFinished work in Process 1\nFinished work in Process 2\nFinished work in Process 4\nFinished work in Process 3\nFinished work in Process 5\nFinished work in Process 6\nFinished work in Process 7\nFinished work in Process 8\nFinished work in Process 9\nAll finished\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#7", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "FIFO files (Named Pipes)", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#7", "page_content": "pipes only work in the same hierarchy so between only between a parent and its child process or between children that share the same parent. We might want to be able to communicate between two processes that are not related. For this we have FIFOs which are a variation of pipes that work very similiarly to files, and also use a file which is why they are also often reffered to as named pipes. FIFOs work the same way as pipes so they also have unidrectional communication with \"first in first out\" semantic, hence the name. \nYou need to create the FIFO with the `int mkfifo(const char *filepath, mode_t mode);` function just like a file, hence it also taking the same mode parameter as when working with files. Opening the FIFO with the `open()` system call in read-only mode or write-only blocks until a second process opens the same FIFO in the other mode. So the two ends of the pipe need to exist. However, you can not open a FIFO with the `O_RDWR` mode. \n```c filename=\"fifo_read.c\"\n#include \n#include \n#include \n#include \n#include \n#include \n#include \n\nint main(void)\n{\nchar *fifoPath = \"myfifo\";\nif (mkfifo(fifoPath, S_IRUSR | S_IWUSR) == -1 && errno != EEXIST)\n{\nprintf(\"Error when creating FIFO file\\n\");\nreturn 1;\n}\n\nint fd = open(fifoPath, O_RDONLY); // blocks till write end is opened\nint output;\nif (read(fd, &output, sizeof(output)) == -1)\n{\nprintf(\"Error when reading from FIFO\");\nreturn 4;\n}\nprintf(\"PARENT: Received: %d\", output);\nremove(fifoPath);\nreturn 0;\n}\n``` \n```c filename=\"fifo_write.c\"\n#include \n#include \n#include \n#include \n#include \n#include \n#include "}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#8", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "FIFO files (Named Pipes)", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#8", "page_content": "int main(void)\n{\nchar *fifoPath = \"myfifo\";\nif (mkfifo(fifoPath, S_IRUSR | S_IWUSR) == -1 && errno != EEXIST)\n{\nprintf(\"Error when creating FIFO file\\n\");\nreturn 1;\n}\nint fd = open(fifoPath, O_WRONLY); // blocks till other end is opened\nint input = 10;\nif (write(fd, &input, sizeof(input)) == -1)\n{\nprintf(\"Error when writing to FIFO\");\nreturn 3;\n}\nprintf(\"CHILD: Sent: %d\", input);\nremove(fifoPath);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#9", "metadata": {"Header 1": "IPC with Pipes", "Header 2": "FIFO files (Named Pipes)", "Header 3": "Non-blocking FIFO", "path": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/ipcWithPipes.mdx#9", "page_content": "There might be cases where you don't want the open system call to block to avoid deadlocks. To avoid this you can add the `O_NONBLOCK` mode. Opening for read-only will succeed even if the write side hasn't been opened yet. However, opening for write only will return -1 and set `errno=ENXIO` unless the other end has already been opened. \nUsing `O_NONBLOCK` does have an influence on reading and writing. \nIf the buffer is empty then the read function returns -1 and sets `errno=EAGAIN`. If additionally the write end is already closed then EOF is returned. \nIf the read end is not ready yet and the write function is used and fills the buffer then write return -1 and sets `errno=EAGAIN`."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#1", "metadata": {"Header 1": "POSIX File I/O", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#1", "page_content": "The POSIX standard also defines functions/system calls for file I/O which are commonly found and used on unix systems. You have to remember that in unix everything is a file so these system calls aren't just used for text files but also multiple other things like devices, sockets etc."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#2", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "File descriptors", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#2", "page_content": "When a file is opened or created by a process the kernel assigns a position in an array unique to each process called the file descriptor. Each entry of this array contains a pointer to a file table which stores for each file, the file descriptor, file status flags, and offset. The file table does not itself contain the file, but instead has a pointer to another table, called the vnode table, which has vital information about the file, including its location in memory. \nThe file descriptors are unique to a process but the integers may by reused by another process without referring to the same file or location within a file. By convention the following are however always the same \n| File | File Descriptor | POSIX Symbolic Constant in \"unistd.h\" |\n| --------------- | --------------- | ------------------------------------- |\n| Standard Input | 0 | STDIN_FILENO |\n| Standard Output | 1 | STDOUT_FILENO |\n| Standard Error | 2 | STDERR | \n"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#3", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Opening and closing", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#3", "page_content": "You can open a file with `int open(const char *pathname, int flags, mode_t mode` from `fcntl.h` which will return the smallest int that’s not used in by the current processor as the file descriptor. Once you are done with the file you can `int close(int fd)` it which will detach the use of the file descriptor for a process. When a process terminates any open file descriptors are automatically closed by the kernel. If something goes wrong with opening a file it will return -1 and you can find the error saved under errno, a global variable. \nPossible errors: \n| Constant | Description |\n| -------- | --------------------------------------------------------------------------------- |\n| EACCES | the requested access to the file or directory is not allowed |\n| EISDIR | file refers to a directory and the access requested involved writing |\n| EMFILE | the per-process limit on the number of open file descriptors has been reached |\n| ENFILE | the system-wide limit on the total number of open files has been reached. |\n| ENOENT | file not found and O_CREAT not speciefed |\n| EROFS | pathname refers to a file on a read-only filesystem andwrite access was requested |\n| ETXTBSY | is busy | \n#### Flags \nThe second argument consists of access, creation & status flags and is created by bitwise OR'ing ('|') the constants you want together. \nAccess: \n| Constant | Description |\n| -------- | ------------------------------------------------------------------------------------- |\n| O_RDONLY | open for reading only |\n| O_WRONLY | open for writing only |"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#4", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Opening and closing", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#4", "page_content": "| O_RDONLY | open for reading only |\n| O_WRONLY | open for writing only |\n| O_RDWR | open for reading and writing |\n| O_APPEND | append on each write |\n| O_CREAT | create file if it does not exist: REQUIRES mode |\n| O_TRUNC | truncate size to 0 |\n| O_EXCL | is specified with O_CREAT, if already exists, then open() fails with the error EEXIST |\n| O_SYNC | | \n#### Modes \nIf you used the flag O_CREAT then you must specify with the mode the permissions of the created file by bitwise OR'ing ('|') the constants you want together. \n| Constant | Description |\n| -------- | ----------- |\n| S_IRUSR | User-read |\n| S_IWUSR | User-write |\n| S_IRGRP | Group-read |\n| S_IWGRP | Group-write |"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#5", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Reading", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#5", "page_content": "The `ssize_t read(int fd, void *buf, size_t count)` function attempts to read up to count bytes from file descriptor fd into the buffer starting at buf. The read operation starts at the files offset, and increments it by the number of bytes read and returns the amount of bytes it read. If the file offset is at or past the end of file, no bytes are read, and read() returns zero."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#6", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Writing", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#6", "page_content": "The `ssize_t write(int fd, const void *buf, size_t count)` function writes up to count bytes from the buffer starting at buf to the file referred to by the file descriptor fd. The number of bytes written may be less than count if, for example, there is insufficient space."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#7", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Positioning", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#7", "page_content": "For each open file descriptor the kernel holds a file offset, the position where the next read or write will begin. With the `off_t lseek(int fd, off_t offset, int whence)` function you can change this offset. The value of whence must be one of the constants SEEK_SET, SEEK_CUR, or SEEK_END, to indicate whether the offset is relative to the beginning of the file, the current file position, or the end of the file."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#8", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Truncating", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#8", "page_content": "With the `int ftruncate(int fd, off_t length)` function you can truncate the file to a size of precisely length bytes."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#9", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "System Calls", "Header 3": "Flushing", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#9", "page_content": "With the `int fsync(int fd)` function you can flush the buffers, or in other words synchronize the file states. This has the same effect as adding the O_SYNC flag when opening the file."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#10", "metadata": {"Header 1": "POSIX File I/O", "Header 2": "Example", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixFileIO.mdx#10", "page_content": "```c\n#include \n#include \n#include \n#include \n\nint main(void)\n{\nint fd = open(\"text.txt\", O_RDWR | O_CREAT | O_SYNC, S_IRUSR | S_IWUSR);\nchar *msg = \"Hello World\";\nint written = write(fd, msg, strlen(msg));\nprintf(\"wrote %d bytes to %d\\n\", written, fd);\n\nlseek(fd, -written, SEEK_CUR);\n\nchar read_msg[100];\nread(fd, read_msg, 100);\n\nprintf(\"read from file %d: \\\"%s\\\"\", fd, read_msg);\n\nreturn 0;\n}\n``` \n```bash filename=\"output\"\nwrote 11 bytes to 3\nread from file 3: \"Hello World\"\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#1", "metadata": {"Header 1": "POSIX Interprocess Communication", "Header 2": "POSIX Semaphores", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#1", "page_content": "You can find a detailed explanation of what a semaphore is [here](../../Concurrent%20Programming/9-synchronizers.md)."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#2", "metadata": {"Header 1": "POSIX Interprocess Communication", "Header 2": "POSIX Semaphores", "Header 3": "Named Semaphores", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#2", "page_content": "Named semaphores have a name and can be used by multiples process just like named pipes (FIFOs). The process that opens the semaphore but doesn't create it just needs to pass the first 2 arguments. Just like with mutexes there are in addition the functions `sem_trywait(sem_t *sem)` and `sem_timedwait(sem_t *sem, const struct timespec *abs_timeout);`. \n```c\n#include \n#include \n#include \n#include \n\nint main(void)\n{\nchar *name = \"/my_semaphore\"; // must start with \"/\"\"\nsem_t *sema = sem_open(name, O_CREAT, S_IRUSR | S_IRGRP, 2); // or/and O_EXCL\nsem_wait(sema);\n// sem_wait(sema); // blocks\nint current = 0;\nsem_getvalue(sema, ¤t);\nprintf(\"Decrease semaphore by 1, now: %d\\n\", current);\nsem_post(sema);\nsem_getvalue(sema, ¤t);\nprintf(\"Add semaphore by 1, now: %d\\n\", current);\nsem_close(sema);\nsem_unlink(name);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#3", "metadata": {"Header 1": "POSIX Interprocess Communication", "Header 2": "POSIX Semaphores", "Header 3": "Unnamed Semaphores", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#3", "page_content": "Unnamed semaphores work the same way as named ones but they are in memory and can be accessed by processes and threads via shared memory. Instead of opening one you need to initialize it with the `int sem_init(sem_t *sem, int pshared, unsigned int value);` function and when you are finished with it remove it with `int sem_destroy(sem_t *sem);`. The pshared argument indicates whether this semaphore is to be shared between the threads of a process, or between processes. If pshared has the value 0, then the semaphore is shared between the threads of a process, and should be located at some address that is visible to all threads. If pshared is nonzero, then the semaphore is shared between processes, and should be located in POSIX shared memory."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#4", "metadata": {"Header 1": "POSIX Interprocess Communication", "Header 2": "POSIX Shared Memory", "path": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/posixIPC.mdx#4", "page_content": "With POSIX shared memory you can have shared memory between processes without using files. \n/// TODOOOOOO an example that actually works/// \n```c\n#include \n#include \n#include \n#include \n#include \n#include \n\nint main(void)\n{\nchar* name = \"/my_shm1\";\nint data = 10;\n\nint fd = shm_open(name, O_CREAT | O_RDWR, S_IRUSR|S_IRGRP);\nif (fd == -1)\n{\nprintf(\"Failed to create shm object\");\nreturn 1;\n}\nftruncate(data, sizeof(int));\n// map shared memory to process address space\nvoid *addr = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);\nif (addr == MAP_FAILED)\n{\nprintf(\"Failed to map shm object\");\nreturn 2;\n}\n\nint id = fork();\nif (id == -1)\n{\nprintf(\"Failed to fork\");\nreturn 3;\n}\nif (id == 0) // child process\n{\ndata = 15;\n}\nelse // parent process\n{\nwait(NULL); // wait for update to take effect\n}\nprintf(\"data is: %d\\n\", data);\nshm_unlink(name);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#1", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Processes", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#1", "page_content": "A program is a file containing instructions on how to create a process. Processes are instances of a running program. A program can have multiple processes and a process can be executing the same program. \nThe kernel sees a process as a piece of user-space memory with program code, constants and initial values for variables. The kernel also keeps track of processes by storing its Process ID (PID), its virtual memory space, open file descriptors and signal handlers amongst other things. \nThe PID is a positive integer and is used to identify a process in the system. The \"init\" process which is responsible for starting the unix operating system has the PID=1. Every process has a parent process apart from init so processes form a tree structure with init as its root. You can check this out with the `pstree` command. If a process dies it get's adopted by init so has the PID=1. \n`pid_t getpid(void)` of current process.\n`pid_t getppid(void)` of parent process. \n"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#2", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Processes", "Header 3": "Memory layout", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#2", "page_content": "The memory of each process is split into segments: program code, initialized data, none initialized data(bss), stack and the heap. \n \nUnix and many other operating systems use virtual memory for performance reasons. When using virtual memory only a so called Page is loaded into the RAM the rest is offloaded. Along with the above mentioned data structure the kernel also keep a so called page table for each process which maps the virtual memory address space to the Page frame in the physical memory, RAM. Address spaces not in use are not mapped so if a process tries to access them you receive a so called segmentation fault (SIGSEGV). \n"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#3", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Processes", "Header 3": "Stack and stack frames", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#3", "page_content": "Stack frames are parts of the stack which are allocated when a function is called for its arguments, local variables and CPU register copies of the external variables. If have worked with recursion you have maybe come across a \"Stackoverflow\" which can happen when the stack is full and no longer has any space."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#4", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Environment variables", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#4", "page_content": "You can create and access environment variables which are stored in the global variable `extern char **environ`. `char *getenv(const char *name)`, `int putenv(char *string)`"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#5", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Signals", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#5", "page_content": "Signals are forms of messages that go between processes. Most signals come from the kernel for example when there is input, an event or an exception occurred (division by 0 for example). A source process generates a signal until the destination process gets time from the scheduler the signal is pending. As soon as it is the process's turn the signal is delivered and it can just what to do. Either the process terminates, ignores the signal or handles it using a signal handler. Every signal has a symbol (in `signal.h`) and a number associated with the symbol for example an interrupt with CTRL+C causes the kernel to send a `SIGINT` signal. When referring to a signal you should always use its symbol as depending on the system it might have a different number. In most cases the IDs 1-31 are reserved for standard signals from the kernel and the next 32-64 are real-time signals. \nTODOOOO\nstrsignal, psignal"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#6", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Signals", "Header 3": "Register signal handlers", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#6", "page_content": "You can register signal handlers using the `sighandler_t signal(int signum, sighandler_t handler)` function meaning your handler will look something like this `void handle(int signal)`. The return value of signal is the previously registered handler if somethign went wrong it will return `SIG_ERR`. There are some default handlers which can be used like the following, there is also `SIG_IGN` to ignore the signal. Often times however handlers just set a global flag `volatile int flag;`. \n"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#7", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Signals", "Header 3": "Send signals", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#7", "page_content": "You can send signals to other processes with `int kill(pid_t pid, int sig)`. If you use the pid=0 the signal is sent to all processes in the same group. If you use pid=-1 the signal is send to all processes it can apart from init. If you use sig=0 you can check if a signal can be sent. \nWith `int raise(int sig)` you can send a signal to your own process which would be the same as `kill(getpid(), sig)`"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#8", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Signals", "Header 3": "Wait for signals", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#8", "page_content": "Often times you want a process to wait until it receives a signal which can be done with the `int pause(void)` function. A common thing to do after a pause would then be to check the global flags which might have been set by the signal handler."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#9", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Signals", "Header 3": "Signal masks", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#9", "page_content": "You can use masks to block certain signals from interupting a process. Signals stay pending until they are unblocked. A signal mask consists of sets of signals. You can crate a set with `int sigemptyset(sigset_t *set)` or `int sigfillset(sigset_t *set)`. You can then use `int sigaddset(const sigset_t *set, int sig)` or `int sigdelset(const sigset_t *set, int sig)` to either add or delete a signal. Once you have your signal set you can use it in the mask with `int sigprocmask(int how, const sigset_t *restrict set, sigset_t *restrict oldset)`. For the how you can use `SIG_BLOCK` which adds the signals from new to the mask, `SIG_UNBLOCK` to remove the signals from new or set it with `SIG_SETMASK`."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#10", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Process lifecycle", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#10", "page_content": "With the `pid_t fork(void)` function a process can create a child process. When this is done instead of copying the entire stack, heap etc. (which would be very bad for performance especially since most child process just start another program as you will see below) the parent and child have a read only page in memory. And then id something needs to be changed the kernel makes a copy-on-write. The `void exit(int status)` function terminates a process and sends the status to any process that is suspended as it is waiting for it with `pid_t wait(int *wstatus)`. If you want a parent to wait for all its children you can use `while(wait(NULL) != -1) {}`. \n![pageTableCopyOnWrite](/compSci/pageTableCopyOnWrite.png) \n```c\n#include \n#include \n#include \n#include \n\nint main()\n{\nint id = fork(); // returns child PID for parent and for child is unassigned so 0.\nif (id == 0)\n{\nprintf(\"hello from child\\n\");\nsleep(3);\nexit(EXIT_SUCCESS);\n}\nelse\n{\nprintf(\"hello from parent\\n\");\nint status;\nwait(&status);\nprintf(\"child has terminated with %d\\n\", status);\n}\n\nprintf(\"Bye\\n\"); // without exit child would still execute this\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#11", "metadata": {"Header 1": "Processes and Signals", "Header 2": "Process lifecycle", "Header 3": "Execve and system", "path": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/processesSignals.mdx#11", "page_content": "With `int execve(const char *pathname, char *const argv[], char *const envp[])` you throw away all the previous program code, stack, heap etc and can load a new program in it's place. This function does not return. \n![execve](/compSci/execve.png) \nWith `int system(const char *command)` you can similiarly create a child process and execute and shell command. The system function takes care of all the hidden details of forking and exiting etc."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#1", "metadata": {"Header 1": "Sockets", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#1", "page_content": "For a more in-depth explanation of how sockets work check out the [distributed systems section]()."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#2", "metadata": {"Header 1": "Sockets", "Header 2": "Sockets Interface", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#2", "page_content": "Sockets are like pipes an IPC mechanism between 2 processes that can either be on the same host or network. \nThe communication domain of the socket defines how the socket address looks and whether the communication is local or over the network. There are the following types of sockets domains: \n- `AF_UNIX` and `AF_LOCAL` for IPC on the same host.\n- `AF_INET` for IPv4 and `AF_INET6` for IPv6. \nThere are two types of sockets: \n- Stream Sockets, `SOCK_STREAM` which are reliable and bidirectional byte streams. Reliable meaning that bytes are received in the same order as they are sent and are guaranteed to arrive or an error is sent.\n- Datagram Sockets, `SOCK_DGRAM` are messaged based. They are not reliable and are connectionless. Meaning a connection between the client and server is not established and that the messages can be sent multiple times or lost and that the order of arrival is non-deterministic."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#3", "metadata": {"Header 1": "Sockets", "Header 2": "Sockets Interface", "Header 3": "System Calls", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#3", "page_content": "Just like all other IPC mechanisms, sockets use system calls to communicate: \n- `int socket(int domain, int type, int protocol);` creates an endpoint for communication and returns a file descriptor to that endpoint. By default, protocol can be 0.\n- `int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen);` \"Assigns a name to a socket\". Name being the addr and addrlen the size, in bytes, of the address structure\n- `int listen(int sockfd, int backlog);` marks the socket to be used to accept incoming connection requests.\n- `int accept(int sockfd, struct sockaddr *restrict addr, socklen_t *restrict addrlen);` extracts the first connection request from queue of pending connections for the listening socket. Creates and returns a new connected socket to be further used. The original socket is unaffected by this call.\n- `int connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen);` connects the socket to the address specified by addr.\n- `int close(int fd);` to close the socket and connection. \nTo then read and write you can then use the same system calls as with pipes. \n \nWhen working with Datagram sockets you don't use the listen, accept and connect system calls because the sockets are connectionless. You also don't use the read and write system calls. Instead, you use the following: \n- `ssize_t recvfrom(int socket, void *restrict buffer, size_t length, int flags, struct sockaddr *restrict address, socklen_t *restrict address_len);`\n- `ssize_t sendto(int socket, const void *message, size_t length, int flags, const struct sockaddr *dest_addr, socklen_t dest_len);` \n"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#4", "metadata": {"Header 1": "Sockets", "Header 2": "Sockets Interface", "Header 3": "Addresses", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#4", "page_content": "When referring to addresses all functions take the `struct sockaddr` which is a generic structure for addresses and can parse the addresses with help of the family attribute (`AF_INET` or `AF_UNIX`). \n```c\nstruct sockaddr {\nunsigned short sa_family;\nchar sa_data[14];\n};\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#5", "metadata": {"Header 1": "Sockets", "Header 2": "Unix Domain Sockets", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#5", "page_content": "Unix domain sockets enable efficient communication between processes on the same host. Unix domain sockets support both stream-oriented sockets with the TCP protocol and datagram sockets with the UDP protocol (reliable compared to over the internet). \nThe address for Unix domain sockets is a file and can be specified with the structure below. When you bind a Unix domain socket a file is created at the specified path for the socket including permissions for the owner and group (to be able to connect and write you need write and execute access). When the socket is no longer required, you must manually delete the socket file. \n```c\nstruct sockaddr_un {\nunsigned short int sun_family; /*AF_UNIX*/\nchar sun_path[UNIX_PATH_MAX]; /*pathname*/\n};\n``` \nWhen repeatedly binding to the same path the errno `ADDRINUSE` is set. 
\nExample using UDP: \n```c filename=\"unix_server\"\n#include \n#include \n#include \n#include \n#include \n#include \n#include \n\n#define SOCK_PATH \"unix_sock_server\"\n\nint main(void)\n{\n\nint server_sock, err;\nstruct sockaddr_un server_sockaddr, client_sockaddr;\nmemset(&server_sockaddr, 0, sizeof(struct sockaddr_un));\nchar buf[256];\nmemset(buf, 0, 256);\n\nserver_sock = socket(AF_UNIX, SOCK_DGRAM, 0);\nif (server_sock == -1)\n{\nprintf(\"Failed to create server socket\");\nexit(1);\n}\n\nserver_sockaddr.sun_family = AF_UNIX;\nstrcpy(server_sockaddr.sun_path, SOCK_PATH);\nerr = bind(server_sock, (struct sockaddr *)&server_sockaddr, sizeof(server_sockaddr));\nif (err == -1)\n{\nprintf(\"Failed to bind server socket\");\nclose(server_sock);\nexit(1);\n}\n\nprintf(\"waiting to recvfrom...\\n\");\nint client_len;\nint bytes_read = recvfrom(server_sock, buf, 256, 0, (struct sockaddr *)&client_sockaddr, &client_len);\nif (bytes_read == -1)\n{\nprintf(\"Failed to read from client to server\");\nclose(server_sock);\nexit(1);\n}\nprintf(\"DATA RECEIVED = %s\\n\", buf);"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#6", "metadata": {"Header 1": "Sockets", "Header 2": "Unix Domain Sockets", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#6", "page_content": "close(server_sock);\nreturn 0;\n}\n``` \n```c filename=\"unix_client\"\n#include \n#include \n#include \n#include \n#include \n#include \n#include \n\n#define SERVER_PATH \"unix_sock_server\"\n\nint main(void)\n{\nint client_sock, err;\nstruct sockaddr_un server_sockaddr;\nmemset(&server_sockaddr, 0, sizeof(struct sockaddr_un));\nchar buf[256];\n\nclient_sock = socket(AF_UNIX, SOCK_DGRAM, 0);\nif (client_sock == -1)\n{\nprintf(\"Failed to create client socket\");\nexit(1);\n}\n\nserver_sockaddr.sun_family = AF_UNIX;\nstrcpy(server_sockaddr.sun_path, SERVER_PATH);\n\nstrcpy(buf, \"Hello from client\");\nprintf(\"Sending data...\\n\");\nerr = sendto(client_sock, buf, strlen(buf), 0, (struct sockaddr *)&server_sockaddr, sizeof(server_sockaddr));\nif (err == -1)\n{\nprintf(\"Failed to write from client to server\");\nclose(client_sock);\nexit(1);\n}\nprintf(\"Data sent!\\n\");\nclose(client_sock);\n\nreturn 0;\n}\n``` \nExample using TCP: \n```c filename=\"unix_server\"\n#include \n#include \n#include \n#include \n#include \n#include \n#include \n\n#define SOCK_PATH \"unix_sock_server\"\n\nint main(void)\n{\nint server_sock, client_sock, err;\nstruct sockaddr_un server_sockaddr;\nstruct sockaddr_un client_sockaddr;\nmemset(&server_sockaddr, 0, sizeof(struct sockaddr_un));\nmemset(&client_sockaddr, 0, sizeof(struct sockaddr_un));\nchar buf[256];\n\n// create socket\nserver_sock = socket(AF_UNIX, SOCK_STREAM, 0);\nif (server_sock == -1)\n{\nprintf(\"Failed to create server socket\");\nexit(1);\n}\n\n// create address\nserver_sockaddr.sun_family = AF_UNIX;\nstrcpy(server_sockaddr.sun_path, SOCK_PATH);\nerr = bind(server_sock, (struct sockaddr *)&server_sockaddr, 
sizeof(server_sockaddr));\nif (err == -1)\n{\nprintf(\"Failed to bind server socket\");\nclose(server_sock);\nexit(1);\n}\n\nerr = listen(server_sock, 1);\nif (err == -1)\n{\nprintf(\"Failed to listen on server socket\");\nclose(server_sock);\nexit(1);\n}\nprintf(\"socket listening...\\n\");"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#7", "metadata": {"Header 1": "Sockets", "Header 2": "Unix Domain Sockets", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#7", "page_content": "err = listen(server_sock, 1);\nif (err == -1)\n{\nprintf(\"Failed to listen on server socket\");\nclose(server_sock);\nexit(1);\n}\nprintf(\"socket listening...\\n\");\n\n// accept incoming connection, store client address\nint client_len;\nclient_sock = accept(server_sock, (struct sockaddr *)&client_sockaddr, &client_len);\nif (client_sock == -1)\n{\nprintf(\"Failed to accept client on server socket\");\nclose(server_sock);\nclose(client_sock);\nexit(1);\n}\n\nprintf(\"waiting to read...\\n\");\nint bytes_read = read(client_sock, buf, sizeof(buf));\nif (bytes_read == -1)\n{\nprintf(\"Failed to read from client to server\");\nclose(server_sock);\nclose(client_sock);\nexit(1);\n}\nprintf(\"DATA RECEIVED = %s\\n\", buf);\n\nmemset(buf, 0, 256); // empty buffer\nstrcpy(buf, \"Hello From Server\");\nprintf(\"Sending data...\\n\");\nerr = write(client_sock, buf, strlen(buf));\nif (err == -1)\n{\nprintf(\"Failed to write from server to client\");\nclose(server_sock);\nclose(client_sock);\nexit(1);\n}\nprintf(\"Data sent!\\n\");\n\nclose(server_sock);\nclose(client_sock);\nreturn 0;\n}\n``` \n```c filename=\"unix_client\"\n#include \n#include \n#include \n#include \n#include \n#include \n#include \n\n#define SERVER_PATH \"unix_sock_server\"\n#define CLIENT_PATH \"unix_sock_client\"\n\nint main(void)\n{\n\nint client_sock, err;\nstruct sockaddr_un server_sockaddr;\nstruct sockaddr_un client_sockaddr;\nmemset(&server_sockaddr, 0, sizeof(struct sockaddr_un));\nmemset(&client_sockaddr, 0, sizeof(struct sockaddr_un));\n\nchar buf[256];\n\nclient_sock = socket(AF_UNIX, SOCK_STREAM, 0);\nif (client_sock == -1)\n{\nprintf(\"Failed to create client socket\");\nexit(1);\n}\n\n// setup 
client address\nclient_sockaddr.sun_family = AF_UNIX;\nstrcpy(client_sockaddr.sun_path, CLIENT_PATH);\n// bind\nerr = bind(client_sock, (struct sockaddr *)&client_sockaddr, sizeof(client_sockaddr));\nif (err == -1)\n{\nprintf(\"Failed to bind client socket\");\nclose(client_sock);\nexit(1);\n}"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#8", "metadata": {"Header 1": "Sockets", "Header 2": "Unix Domain Sockets", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#8", "page_content": "// setup server address\nserver_sockaddr.sun_family = AF_UNIX;\nstrcpy(server_sockaddr.sun_path, SERVER_PATH);\n// connect to server\nerr = connect(client_sock, (struct sockaddr *)&server_sockaddr, sizeof(server_sockaddr));\nif (err == -1)\n{\nprintf(\"Failed to connect client to server\");\nclose(client_sock);\nexit(1);\n}\n\nstrcpy(buf, \"Hello from client\");\nerr = write(client_sock, buf, strlen(buf));\nif (err == -1)\n{\nprintf(\"Failed to write from client to server\");\nclose(client_sock);\nexit(1);\n}\nprintf(\"Data sent!\\n\");\n\n// read data from server\nprintf(\"Waiting to recieve data...\\n\");\nmemset(buf, 0, sizeof(buf)); // empty buffer\nerr = read(client_sock, buf, sizeof(buf));\nif (err == -1)\n{\nprintf(\"Failed to read from server to client\");\nclose(client_sock);\nexit(1);\n}\nprintf(\"DATA RECEIVED = %s\\n\", buf);\n\nclose(client_sock);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#9", "metadata": {"Header 1": "Sockets", "Header 2": "Unix Domain Sockets", "Header 3": "Socketpairs", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#9", "page_content": "You can create socketpairs which can then be used very similarly to pipes with the `int socketpair(int domain, int type, int protocol, int sv[2]);` function. \n```c\nvoid child(int socket) {\nconst char hello[] = \"hello parent, I am child\";\nwrite(socket, hello, sizeof(hello)); /* NB. this includes nul */\n/* go forth and do childish things with this end of the pipe */\n}\n\nvoid parent(int socket) {\n/* do parental things with this end, like reading the child's message */\nchar buf[1024];\nint n = read(socket, buf, sizeof(buf));\nprintf(\"parent received '%.*s'\\n\", n, buf);\n}\n\nvoid socketfork() {\nint fd[2];\nstatic const int parentsocket = 0;\nstatic const int childsocket = 1;\npid_t pid;\n\n/* 1. call socketpair ... */\nsocketpair(PF_LOCAL, SOCK_STREAM, 0, fd);\n\n/* 2. call fork ... */\npid = fork();\nif (pid == 0) { /* 2.1 if fork returned zero, you are the child */\nclose(fd[parentsocket]); /* Close the parent file descriptor */\nchild(fd[childsocket]);\n} else { /* 2.2 ... you are the parent */\nclose(fd[childsocket]); /* Close the child file descriptor */\nparent(fd[parentsocket]);\n}\nexit(0); /* do everything in the parent and child functions */\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#10", "metadata": {"Header 1": "Sockets", "Header 2": "Internet Sockets", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#10", "page_content": "Internet sockets work very similarly to the Unix domain sockets. Stream sockets are based on the TCP protocol. Datagram sockets are based on the UDP protocol. \n```c\nsvaddr.sin6_port = htons(PORT_NUM);\ninet_pton(AF_INET6, argv[1], &svaddr.sin6_addr)\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#11", "metadata": {"Header 1": "Sockets", "Header 2": "Internet Sockets", "Header 3": "Network Byte Order", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#11", "page_content": "Unfortunately, not all computers store the bytes for a multibyte value like an IP address in the same order. There are two ways to store this value: \n- Little Endian: Low-order byte is stored on the starting address (A) and higher order byte is stored on the next address (A + 1).\n- Big Endian: High-order byte is stored on the starting address (A) and lower order byte is stored on the next address (A + 1). \nNetwork byte order uses the big endian system. Library functions that work with IP addresses need to be converted to this system for which there are the following functions: \n- `uint32_t ntohl(uint32_t netlong);` Network to host.\n- `uint16_t ntohs(uint16_t netshort);` Network to host.\n- `uint32_t htonl(uint32_t hostlong);` Host to Network.\n- `uint16_t htons(uint16_t hostshort);` Host to Network."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#12", "metadata": {"Header 1": "Sockets", "Header 2": "Internet Sockets", "Header 3": "Address Structures", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#12", "page_content": "To store IP addresses there are the following structs: \n```c\n// IPv4\nstruct in_addr {\nuint32_t s_addr; // Network Byte Order\n};\nstruct sockaddr_in {\nsa_family_t sin_family; // AF_INET\nin_port_t sin_port; // Network Byte Order\nstruct in_addr sin_addr; // Internet Adresse\n};\n// IPv6\nstruct in6_addr {\nunsigned char s6_addr[16]; // IPv6 address\n};\nstruct sockaddr_in6 {\nsa_family_t sin6_family; // AF_INET6\nin_port_t sin6_port; // Port Nummer\nuint32_t sin6_flowinfo; // IPv6 Flow Info\nstruct in6_addr sin6_addr; // IPv6 Adresse\nuint32_t sin6_scope_id; // Scope ID\n};\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#13", "metadata": {"Header 1": "Sockets", "Header 2": "Internet Sockets", "Header 3": "Loopback and Wildcard Addresses", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#13", "page_content": "IPv4 Loopback 127.0.0.1 and Wildcard 0.0.0.0: INADDR_LOOPBACK, INADDR_ANY\nIPv6 Loopback (::1) and Wildcard (::): IN6ADDR_LOOPBACK_INIT, IN6ADDR_ANY_INIT"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#14", "metadata": {"Header 1": "Sockets", "Header 2": "Internet Sockets", "Header 3": "Converting Addresses", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#14", "page_content": "Converts dot notation to binary: \n`int inet_pton(int af, const char *restrict src, void *restrict dst);` converts the character string src into a network address structure in the af address family, then copies the network address structure to dst. The af argument must be either AF_INET or AF_INET6. dst is written in network byte order. \nConverts binary to dot notation: \n`const char *inet_ntop(int af, const void *restrict src, char *restrict dst, socklen_t size);` converts the network address structure src in the af address family into a character string. The resulting string is copied to the buffer pointed to by dst, which must be a non-null pointer. The caller specifies the number of bytes available in this buffer in the argument size"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#15", "metadata": {"Header 1": "Sockets", "Header 2": "Internet Sockets", "Header 3": "Host Lookup", "path": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/sockets.mdx#15", "page_content": "```c\nstruct addrinfo {\nint ai_flags;\nint ai_family;\nint ai_socktype;\nint ai_protocol;\nsocklen_t ai_addrlen;\nstruct sockaddr *ai_addr;\nchar *ai_canonname;\nstruct addrinfo *ai_next;\n};\n``` \n`int getaddrinfo(const char *restrict node, const char *restrict service, const struct addrinfo *restrict hints, struct addrinfo **restrict res);`\nGiven node and service, which identify an Internet host and a service, getaddrinfo() returns one or more addrinfo structures, each of which contains an Internet address. After use the struct should be freed again with `void freeaddrinfo(struct addrinfo *result)`."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#1", "metadata": {"Header 1": "Threads and Synchronization with C", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#1", "page_content": "Threads are similar to process and allow a program to do multiple things at once by having multiple threads in it. A key difference between threads and process is however that threads share the same global memory and just have their private stack for local variables and function calls and are therefore not as expensive as process which have the big overhead of creating an entire new memory space. This is why threads are also often called lightweight processes. \nExchanging information between process can be quiet tricky and costly because the parent and child don't share memory. However, threads share the following things between each other amongst other things \n- PID and Parent PID\n- Open file descriptors\n- Signal handlers\n- Global memory \nEach thread however does receive the following things for itself \n- Thread ID\n- Signal Mask\n- Errno variable\n- Stack \nTo work with threads you can use the Pthreads API which are also known as POSIX threads, sounds familiar... which is provided with the gcc compiler. Important to know here is that Pthreads functions don't return -1 on failure like many other functions in the standard library. Instead they return 0 on success and add the errno on failure. To be able to use the Pthreads API you need to pass the `-pthread` flag to the gcc compiler."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#2", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Creating Threads", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#2", "page_content": "To create a thread you will need to use the following function \n```c\nint pthread_create(pthread_t *restrict thread,\nconst pthread_attr_t *restrict attr,\nvoid *(*start_routine)(void *),\nvoid *restrict arg);\n``` \nThe first parameter is an integer that is used as an output parameter and is used to identify the thread in your operating system.\nThe second parameter is for specific attributes for the thread, by passing NULL you can use the default.\nThe third parameter is the function that the thread will execute once it is started.\nThe fourth parameter is used to pass arguments to the function and must be cast to a void pointer. If you want to pass multiple arguments, you would use a pointer to a struct. \n```c\nint pthread_join(pthread_t thread, void **retval);\n``` \nA call to the join function blocks the calling thread until the thread with ID as the first parameter is terminated. You can also store the return value of the thread with the second parameter. \nThreads can terminate in multiple ways \n- By calling `void pthread_exit(void *retval);`\n- By letting the thread function return.\n- By calling exit which will terminate the process including all its threads. \nInterestingly of the main thread calls pthread_exit all the other threads will continue to execute otherwise they all automatically terminate when main returns. 
\n```c\n#include \n#include \n\n#include \n#include \n\nvoid *foo()\n{\nprintf(\"foo ID: %ld\\n\", pthread_self());\npthread_exit(NULL);\n}\n\nint main(void)\n{\nprintf(\"main ID: %ld\\n\", pthread_self());\npthread_t foo_t;\npthread_create(&foo_t, NULL, foo, NULL);\n\npthread_join(foo_t, NULL);\nprintf(\"done\");\n\nreturn 0;\n}\n\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#3", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Passing Values", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#3", "page_content": "When creating the pthread you can pass the arguments using the fourth parameter. \n```c\n#include \n#include \n#include \n\ntypedef struct Point\n{\ndouble x;\ndouble y;\n} Point;\n\nvoid *printPoint(void *args)\n{\nPoint p = *((Point *)args);\nprintf(\"(%f, %f)\", p.x, p.y);\npthread_exit(NULL);\n}\nint main(void)\n{\npthread_t pid;\nPoint p = {2, 10};\n\npthread_create(&pid, NULL, printPoint, &p);\npthread_join(pid, NULL);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#4", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Returning Values", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#4", "page_content": "You can return values from a thread with the pthread_exit function. The values you return should be on the heap otherwise you will run into problems. \n```c\n#include \n#include \n#include \n\ntypedef struct Point\n{\ndouble x;\ndouble y;\n} Point;\n\nvoid *createPoint()\n{\nPoint *p = malloc(sizeof(Point));\np->x = 3;\np->y = 7;\npthread_exit((void *)p);\n}\nint main(void)\n{\npthread_t pid;\nPoint p;\nvoid *res;\npthread_create(&pid, NULL, createPoint, NULL);\npthread_join(pid, &res);\n\np = *((Point *)res);\nfree(res);\nres = NULL;\nprintf(\"(%f, %f)\", p.x, p.y);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#5", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Further Operations on Threads", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#5", "page_content": "There are further operations that you can use with threads for example: \n- `int pthread_equal(pthread_t t1, pthread_t t2);` which compares two threads to see whether they are the same.\n- `int pthread_detach(pthread_t thread);` by default a thread runs in joinable mode. A joinable thread will not release its resources even after termination until some other thread calls `pthread_join()` with its ID. A Detached thread automatically releases its allocated resources on exit. No other thread needs to join it. Therefore there is also no way to determine its return value.\n- `int pthread_cancel(pthread_t thread);` sends a cancellation request to the thread. Whether the target thread reacts to the cancellation request depends on its cancelability state and type.\n- `int pthread_kill(pthread_t thread, int sig);` sends the signal sig to thread."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#6", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Synchronization", "Header 3": "Mutex", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#6", "page_content": "Threads can have a mutual state which is useful but you need to be careful when accessing and changing this state. A critical section is a code block that uses a mutual variable and should only be executed atomically, so at once, by one thread, so that the result does not depend on the interleaving. A mutex/lock can guarantee this behavior to avoid race conditions. For more on this, there is an entire [section dedicated to concurrent programming](../../concurrentProgramming/locking). \nMutex variables are of the type `pthread_mutex_t` and need to be initialized before they are used with `pthrad_mutex_t m = PTHREAD_MUTEX_INITIALIZER;` or the `int pthread_mutex_init(pthread_mutex_t *restrict mutex, const pthread_mutexattr_t *restrict attr);` function. To then use the lock you can use the following functions: \n- `int pthread_mutex_lock(pthread_mutex_t *mutex);` Acquires the lock. If the lock is already in use then this function blocks till it can acquire the lock. If called repeatedly in the same thread that already has the lock a deadlock occurs.\n- `int pthread_mutex_unlock(pthread_mutex_t *mutex);` Releases the lock. If called in a thread that has already released a lock will return an error.\n- `int pthread_mutex_trylock(pthread_mutex_t *mutex);` Tries to acquire the lock. If it can't it does not block. Instead, it returns `EBUSY`.\n- `int pthread_mutex_timedlock(pthread_mutex_t *restrict mutex, const struct timespec *restrict abstime);` Tries to acquire the lock and waits for a maximum of abstime. If it couldn't get acquire the lock in the given time it returns `ETIMEDOUT`. 
\nWith the timespec struct looking like this: \n```c\nstruct timespec {\ntime_t tv_sec; // seconds\nlong tv_nsec; // nanoseconds\n};\n``` \nAn example of using locks: \n```c\n#include \n#include \n#include \n#include \n#include \n\npthread_t tid[2];\nint counter;\npthread_mutex_t lock; // lock object"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#7", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Synchronization", "Header 3": "Mutex", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#7", "page_content": "pthread_t tid[2];\nint counter;\npthread_mutex_t lock; // lock object\n\nvoid *incrementCounter()\n{\nfor (unsigned long i = 0; i < 1000; i++)\n{\npthread_mutex_lock(&lock);\ncounter++;\npthread_mutex_unlock(&lock);\n}\npthread_exit(NULL);\n}\n\nint main(void)\n{\nif (pthread_mutex_init(&lock, NULL) != 0) // init lock\n{\nprintf(\"\\n mutex init failed\\n\");\nreturn 1;\n}\nint i = 0;\nwhile (i < 2)\n{\nint err = pthread_create(&(tid[i]), NULL, &incrementCounter, NULL);\nif (err != 0)\nprintf(\"\\ncan't create thread :[%s]\", strerror(err));\ni++;\n}\n// wait for threads to finish\npthread_join(tid[0], NULL);\npthread_join(tid[1], NULL);\nprintf(\"Counter: %d\", counter);\npthread_mutex_destroy(&lock); // clean up lock\n\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#8", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Synchronization", "Header 3": "Condition Variables", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#8", "page_content": "In C we can also use [condition variables](../../concurrentProgramming/conditionVariables) to further synchronize concurrent programs. Condition variables in C work just like in Java. `pthread_cond_signal` is the equivalant of `notify()` and `pthread_cond_broadcast` of `notifyAll()`. \n```c\npthread_mutex_t lock; // init with PTHREAD_MUTEX_INITIALIZER\npthread_cond_t count_nonzero; // init with PTHREAD_COND_INITIALIZER\nunsigned int count;\n\ndecrementCount()\n{\npthread_mutex_lock(&lock);\nwhile (count == 0)\npthread_cond_wait(&count_nonzero, &lock);\ncount = count - 1;\npthread_mutex_unlock(&lock);\n}\n\nincrementCount(){\npthread_mutex_lock(&lock);\ncount++;\nif(count == 0){\npthread_cond_broadcast(&count_nonzero);\n// pthread_cond_signal(&count_nonzero)\n}\npthread_mutex_unlock(&lock);\n}\n````"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#9", "metadata": {"Header 1": "Threads and Synchronization with C", "Header 2": "Errno", "path": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/threads.mdx#9", "page_content": "To be able to set errno in a thread-safe manner we need to use a makro from the pthread library \n```c\n# define errno (*__errno_location())\n*__errno_location() = EBUSY;\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#1", "metadata": {"Header 1": "Time Measuring", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#1", "page_content": "In programs we care about two types of time: \n- Real-time or also known as calendar time is a fixed time from the calendar and is used for timestamps etc.\n- Process time is the amount of time a process takes which is used for measuring performance etc."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#2", "metadata": {"Header 1": "Time Measuring", "Header 2": "Calendar Time", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#2", "page_content": "Calendar time is always in UTC(Coordinated Universal Time)/GMT(Greenwich Mean Time) no matter in which timezone the program is run. In C and most other programming languages, time is handled internally as a signed integer which is based on [Unix/epoche/POSIX time](https://en.wikipedia.org/wiki/Unix_time). The value 0 is 01.01.1970 00:00, also commonly referred to as the birth time of Unix. The integer value of time is then the number of seconds before or after this time. So if the time integer value is 60 then it corresponds to 01.01.1970 00:01. \nThe function `time_t time(time_t *tloc);` returns the time as the number of seconds since Unix time. If tloc is non-NULL, the return value is also stored in tloc."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#3", "metadata": {"Header 1": "Time Measuring", "Header 2": "Calendar Time", "Header 3": "Getting and Setting System Time", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#3", "page_content": "With the following functions you can get and set the system time as well as the timezone: \n- `int gettimeofday(struct timeval *restrict tv, struct timezone *restrict tz);`\n- `int settimeofday(const struct timeval *tv, const struct timezone *tz);` \nWith the corresponding structs: \n```c\nstruct timeval {\ntime_t tv_sec; /* seconds */\nsuseconds_t tv_usec; /* microseconds */\n};\nstruct timezone {\nint tz_minuteswest; /* minutes west of Greenwich */\nint tz_dsttime; /* type of DST correction */\n};\n``` \nHowever, the timezone structure is obsolete and should therefore normally be specified as NULL."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#4", "metadata": {"Header 1": "Time Measuring", "Header 2": "Calendar Time", "Header 3": "Locale", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#4", "page_content": "Locale is a set of parameters that defines the user's language, region and how numbers, dates, times etc. should be represented. You can set the program's locale with `char *setlocale(int category, const char *locale);`. The category can be `LC_ALL`, `LC_TIME`, `LC_NUMERIC` amongst others. If you pass NULL as the locale you can read the current locale of the given category. \nSome possible locales are: en_US, de_DE, de_CH etc."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#5", "metadata": {"Header 1": "Time Measuring", "Header 2": "Calendar Time", "Header 3": "Broken-Down Time", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#5", "page_content": "Calendar time represents absolute time as elapsed time since the epoch. This is convenient for computation but has no relation to the way people normally think of calendar time. By contrast, broken-down time is a binary representation of calendar time separated into year, month, day, and so on. Broken-down time values are not useful for calculations, but they are useful for printing human-readable time information. A broken-down time value is always relative to a choice of time zone. \n```c\nstruct tm {\nint tm_sec; // seconds [0..60]\nint tm_min; // minutes [0..59]\nint tm_hour; // hours [0..23]\nint tm_mday; // day of the month [1..31]\nint tm_mon; // month [0..11]\nint tm_year; // years since 1900\nint tm_wday; // weekday [0..6], Sunday = 0\nint tm_yday; // day of year [0..365]\nint tm_isdst; // daylight saving time flag\n};\n``` \nWith `struct tm *gmtime(const time_t *t);` you can convert a time_t to a broken-down time in UTC. Or you can use `struct tm *localtime(const time_t *t);` to convert a time_t to a broken-down time in local time. \nWith `time_t mktime(struct tm *timeptr);` you can convert a broken-down time into a time_t since the epoch. The tm_wday and tm_yday components of the structure are ignored."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#6", "metadata": {"Header 1": "Time Measuring", "Header 2": "Calendar Time", "Header 3": "Time and Strings", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#6", "page_content": "When not computing with times we want them to be in a human-readable form, which is why there are lots of functions to convert times to strings: \n- `char *ctime(const time_t *timep);` returns a 26-byte string representing the time in the local timezone, taking DST into account: \"Wed Jun 8 14:22:34 2011\".\n- `char *asctime(const struct tm *t);` does the same as `ctime()` but takes a broken-down time and performs no timezone or DST conversion. \nOften we also want to be able to format the string. For this we can use `size_t strftime(char *restrict s, size_t max, const char *restrict format, const struct tm *restrict tm);` which formats the time and stores it in s. For example, \"%Y-%m-%dT%H:%M:%SZ\" becomes \"2018-12-29T12:17:25Z\", where the Z suffix is used only for UTC. \nThere are also cases, for example when getting input from users, where we want to parse a string into a time. For this we can use `char *strptime(const char *restrict s, const char *restrict format, struct tm *restrict tm);` which parses the string s using the format and stores the result in tm. \nSome key formats being: \n| Format | Description |\n| ---------------------- | --------------------------------------- |\n| %a / %A | abbreviated / full weekday name |\n| %b / %B | abbreviated / full month name |\n| %d / %m / %w / %W / %u | day / month / weekday / week as decimal |\n| %Y / %y | year with or without century |\n| %H / %M / %S | hour / minute / second as decimal |"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#7", "metadata": {"Header 1": "Time Measuring", "Header 2": "Calendar Time", "Header 3": "Timezones", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#7", "page_content": "Timezones define the time and region of a program. On Unix systems the timezone is stored in the environment variable `TZ`, which you can inspect to find out yours. \nWith `void tzset(void);` you can initialize the timezone information, which in turn sets the following global variables: \n- `extern char *tzname[2];` the zone and DST zone names.\n- `extern long timezone;` the difference to UTC in seconds.\n- `extern int daylight;` non-zero if the zone has DST."}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#8", "metadata": {"Header 1": "Time Measuring", "Header 2": "Process Time", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#8", "page_content": "Process time is the CPU time that a process has used since creation. It consists of two parts: user CPU time, which is the amount of time spent in user mode, also commonly referred to as virtual time, and system CPU time, which is the amount of time spent in kernel mode, for example while doing system calls. \n`clock_t times(struct tms *t);` returns the number of clock ticks that have elapsed since an arbitrary point in the past and stores the current process times in the parameter.\n`clock_t clock(void);` returns an approximation of the processor time used so far by the program. \n```c\nstruct tms {\nclock_t tms_utime; /* user time */\nclock_t tms_stime; /* system time */\nclock_t tms_cutime; /* user time of children */\nclock_t tms_cstime; /* system time of children */\n};\n``` \n```c\n#include <stdio.h>\n#include <time.h>\n#include <unistd.h>\n#include <sys/times.h>\n\nint main() {\nlong tckPerSec = sysconf(_SC_CLK_TCK);\nlong clockPerSec = CLOCKS_PER_SEC;\nprintf(\"Ticks per second: %ld\\n\", tckPerSec);\nprintf(\"Clocks per second: %ld\\n\", clockPerSec);\nstruct tms sinceStart;\nsleep(10);\nclock_t ticks = times(&sinceStart);\nprintf(\"Since start: User(%ld), System(%ld)\\n\", sinceStart.tms_utime, sinceStart.tms_stime);\nprintf(\"%ld\", ticks / tckPerSec); // elapsed seconds since the arbitrary point\nreturn 0;\n}\n``` \n```c\n#include <stdio.h>\n#include <unistd.h>\n#include <sys/times.h>\n#include <sys/types.h>\n#include <sys/wait.h>\n\nint main(int argc, char *argv[]) {\nstruct tms t0_buf;\nclock_t t0 = times(&t0_buf);\npid_t pid = fork();\nif (pid == 0) {\nexecvp(argv[1], argv + 1); // does not return\n}\nfflush(stdout);\nwait(NULL); // wait for child to exit\nstruct tms t1_buf;\nclock_t t1 = times(&t1_buf);\n\nlong ticks_per_s = sysconf(_SC_CLK_TCK);\nprintf(\"\\nreal \\t%ld\\nuser \\t%lfs\\nsys \\t%lfs\\n\",\n(t1 - t0) / ticks_per_s,\n(t1_buf.tms_cutime - t0_buf.tms_cutime) / (double) ticks_per_s,\n(t1_buf.tms_cstime - t0_buf.tms_cstime) / (double) ticks_per_s);\nreturn 0;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#9", "metadata": {"Header 1": "Time Measuring", "Header 2": "Process Time", "Header 3": "Unix Timers and Sleeping", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#9", "page_content": "With a timer, you can send notifications to a process after a certain time. Sleeping suspends the process or thread for a given time. \n#### Interval Timer \nWith `int setitimer(int which, const struct itimerval *restrict new_value, struct itimerval *restrict old_value);` you can set up a timer that notifies in regular intervals. You can use the following which values: \n- `ITIMER_REAL` counts down in real-time and sends a SIGALRM signal.\n- `ITIMER_VIRTUAL` counts down in user CPU time and sends a SIGVTALRM signal.\n- `ITIMER_PROF` counts down in both user and system CPU time and sends a SIGPROF signal. Used together with the one above, it lets you separate the time spent in user and kernel mode. \n```c\nstruct itimerval {\nstruct timeval it_interval; /* Interval for periodic timer */\nstruct timeval it_value; /* Time until next notification */\n};\n\nstruct timeval {\ntime_t tv_sec; /* seconds */\nsuseconds_t tv_usec; /* microseconds */\n};\n``` \nYou can check the remaining time with `int getitimer(int which, struct itimerval *curr_value);`. \n```c\n#include <stdio.h>\n#include <stdlib.h>\n#include <signal.h>\n#include <sys/time.h>\n\nvoid timer_handler(int signum)\n{\nstatic int count = 0;\nprintf(\"timer expired %d times\\n\", ++count);\n}\n\nint main()\n{\nstruct itimerval timer;\nsignal(SIGVTALRM, timer_handler);\n\n/* Configure the timer to expire after 250 msec... */\ntimer.it_value.tv_sec = 0;\ntimer.it_value.tv_usec = 250000;\n/* ... and every 250 msec after that. */\ntimer.it_interval.tv_sec = 0;\ntimer.it_interval.tv_usec = 250000;\n/* Start a virtual timer. It counts down whenever this process is\nexecuting. */\nsetitimer(ITIMER_VIRTUAL, &timer, NULL);"}}
+{"id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#10", "metadata": {"Header 1": "Time Measuring", "Header 2": "Process Time", "Header 3": "Unix Timers and Sleeping", "path": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx", "id": "../pages/digitalGarden/cs/c/systemProgramming/timeMeasuring.mdx#10", "page_content": "/* Do busy work. */\nwhile (1)\n;\n}\n``` \n#### One-time Timer - Alarm \nWith `unsigned int alarm(unsigned int seconds);` you can set up a one-time timer. When the timer expires the `SIGALRM` signal is sent. An existing timer can be removed with `alarm(0);`. \n#### Timer Precision \nDepending on CPU load the process might only resume some time after being notified. This delay however does not influence the next signal (delays do not accumulate). \n#### Suspend Processes - Sleeping \nYou can suspend a process with `unsigned int sleep(unsigned int seconds);`. Internally it is implemented with `int nanosleep(const struct timespec *req, struct timespec *rem);`."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#1", "metadata": {"Header 1": "Computer Systems", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#1", "page_content": "This page is meant as a brief introduction to the types of computer systems there are, how they are made, and the issues we have faced in improving our computer systems over the years."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#2", "metadata": {"Header 1": "Computer Systems", "Header 2": "Types of Computer Systems", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#2", "page_content": "Let us first answer the question of what a computer is. What is the difference between a computer and any other machine? A computer is a programmable machine, meaning that it can change its behavior and functionality, unlike other simple machines. \nMost commonly computers are split up into the following types: \n- Personal computer, short PC. The PC is the type of computer most people use and think of when talking about a computer. It serves a very general purpose and offers a wide variety of software to solve problems in our day-to-day life. In more recent years this type has also seen the addition of personal mobile devices (PMD), more commonly known as smartphones and tablets. These devices are meant for the average consumer, which subjects them to cost/performance tradeoffs.\n- Server computers, or simply servers, are computers that are usually accessed only over a network (Internet or LAN). Servers are built from the same basic technology as personal computers but with more performance and storage capabilities. Since they are used by multiple people and are used to communicate between different applications and/or networks, they have to be reliable to mitigate downtime.\n- Supercomputers represent the peak of what can be done with computers and are mainly used for research and academic purposes. You can find out more about the top supercomputers [here](https://www.top500.org/).\n- Embedded computers are the most used computers, but people would never think so as they are usually hidden. They have a very wide range of applications and performances, for example being part of your car to optimize fuel efficiency or controlling the temperature in your coffee machine."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#3", "metadata": {"Header 1": "Computer Systems", "Header 2": "Components of a Computer", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#3", "page_content": "\n- CPU = Control + Datapath\n- Memory\n- IO\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#4", "metadata": {"Header 1": "Computer Systems", "Header 2": "How are Chips made", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#4", "page_content": "To do: silicon, Moore's law, yield etc. \n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#5", "metadata": {"Header 1": "Computer Systems", "Header 2": "The Power Wall", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#5", "page_content": "As transistors get smaller, their power density stays constant. Power wall and Dennard scaling:\ncan't reduce voltage because of noise => bits getting flipped, and can't cool \nled to shift to multicore processors \nAmdahl's law: can't infinitely speed up, there is some limit."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#6", "metadata": {"Header 1": "Computer Systems", "Header 2": "Programming a Computer", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx#6", "page_content": "To do: high-level languages, compilers, assemblers, instruction sets."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/memoryHierarchy.mdx#1", "metadata": {"Header 1": "Memory Hierarchy", "path": "../pages/digitalGarden/cs/computerArchitecture/memoryHierarchy.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/memoryHierarchy.mdx#1", "page_content": "\nTo do: caches, registers, misses etc.\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#1", "metadata": {"Header 1": "Working with Numbers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#1", "page_content": "Working with numbers on computer systems is slightly more complex than one would think due to the fact that computers work only with the binary digits 1 and 0."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#2", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Integers", "Header 3": "Unsigned Integers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#2", "page_content": "With n bits we can represent $2^n$ things. Encoding unsigned integers, i.e. integers with no sign, so only positive numbers, is pretty simple. The first bit, called the LSB, corresponds to $2^0$, the second one to $2^1$, and so on. If a bit is set to 1 we add the value corresponding to that bit to the result. So if we have 32 bits we can represent $2^{32}$ things, meaning if we start at 0 we can represent the range from 0 to $2^{32}-1$. \nThis can also be described mathematically as follows if we denote our binary representation as $B=b_{n-1},b_{n-2},..,b_0$ and the function $D(B)$ which maps the binary representation to its corresponding value. \n$$\nD(B)= \\sum_{i=0}^{n-1}{b_i \\cdot 2^i}\n$$"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#3", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#3", "page_content": "When we involve signed integers it gets a bit more complex since now we also want to deal with negative numbers. Throughout history there have been a few representations for encoding signed integers which often get forgotten."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#4", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "Header 3": "Sign Magnitude", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#4", "page_content": "The idea for the sign and magnitude representation is a very simple one. You have a bit (the MSB) that represents the sign: 1 for negative, 0 for positive. All the other bits are the magnitude, i.e. the value. \n$$\nD(B)= (-1)^{b_{n-1}} \\cdot \\sum_{i=0}^{n-2}{b_i \\cdot 2^i}\n$$ \n \n\n$$\n\\begin{align*}\n0000\\,1010_2 &= 10 \\\\\n1000\\,1010_2 &= -10\n\\end{align*}\n$$\n \nSeems pretty simple. However, there are two different representations for 0, which isn't good since computers often make comparisons with 0. This could potentially double the number of comparisons that need to be made, which is one reason why the sign magnitude representation is not optimal."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#5", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "Header 3": "One's Complement", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#5", "page_content": "The idea of the one's complement is also very simple: we want to quickly find the negative number of the same positive value by just flipping all the bits. In other words: \n$$\n-B=\\,\\sim B\n$$ \nAnd mathematically defined: \n$$\nD(B)= -b_{n-1}(2^{n-1}-1) + \\sum_{i=0}^{n-2}{b_i \\cdot 2^i}\n$$ \n \n\n$$\n\\begin{align*}\n0000\\,1010_2 &= 10 \\\\\n1111\\,0101_2 &= -10\n\\end{align*}\n$$\n \nHowever, just like the sign magnitude representation, the one's complement has the issue of having two representations for 0."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#6", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "Header 3": "Two's Complement", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#6", "page_content": "Finally, we have the representation that is used nowadays, the two's complement. This representation solves the issue of the double representation of 0 whilst still being able to quickly tell if a number is positive or negative. It does however lead to there not being a positive value corresponding to the lowest negative value. \n$$\nD(B)= -b_{n-1}(2^{n-1}) + \\sum_{i=0}^{n-2}{b_i \\cdot 2^i}\n$$ \n \n\n$$\n\\begin{align*}\n0000\\,1010_2 &= 10 \\\\\n1111\\,0110_2 &= -10\n\\end{align*}\n$$\n \nAn easy way to calculate the negative of a given value in the two's complement representation is the following: \n$$\n\\sim B + 1 \\Leftrightarrow -B\n$$ \n#### Sign Extension \nWhen using the two's complement we need to be aware of something when converting a binary number with $n$ bits to a binary number with $n+k$ bits, called sign extension. Put simply, for the value of the binary number to stay the same we need to extend the sign bit. \n \n\n$$\n\\begin{align*}\n10:&\\, 0000\\,1010_2 \\Rightarrow 0000\\,0000\\,0000\\,1010_2 \\\\\n-10:&\\, 1111\\,0110_2 \\Rightarrow 1111\\,1111\\,1111\\,0110_2\n\\end{align*}\n$$\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#7", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#7", "page_content": "Representing real numbers can be pretty hard as you can imagine, since real numbers can have infinitely many digits such as $\\pi = 3.14159265358979323846264338327950288...$ but we only have finite resources and bits to represent them, for example 4 or 8 bytes. Another problem is that oftentimes when working with real numbers we find ourselves using very small or very large numbers such as $1$ lightyear $=9'460'730'472'580.8\\,km$ or the radius of a hydrogen atom $0.000000000025\\,m$."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#8", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Binary Fractions", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#8", "page_content": "One way, but not a very good way, to represent real numbers is to use binary fractions. Binary fractions are a way to extend the unsigned integer representation by adding a so-called binary/zero/decimal point. To the left of the binary point, we have just like with the unsigned representation the powers of 2. To the right, we now also use the powers of 2 with negative exponents to get the following structure: \n$$\nB = b_{i},b_{i-1},..,b_0\\,.\\,b_{-1},...,b_{-j+1},b_{-j}\n$$ \nAnd Formula: \n$$\nD(B) = \\sum_{k=-j}^{i}{b_k \\cdot 2^k}\n$$ \n \n\n$$\n\\begin{align*}\n5 \\frac{3}{4} &= 0101.1100_2 \\\\\n2 \\frac{7}{8} &= 0010.1110_2 \\\\\n\\frac{63}{64} &= 0.1111110_2\n\\end{align*}\n$$\n \nFrom the above examples we can make some key observations, the first two of which you might already know if you have been programming for a long time. \n- Dividing by powers of 2 can be done by shifting right: $x / 2^y \\Leftrightarrow x >> y$\n- Multiplying by powers of 2 can be done by shifting left: $x \\cdot 2^y \\Leftrightarrow x << y$ \nThis representation does have its limits since we can only represent numbers of the form $\\frac{x}{2^k}$; other numbers such as $\\frac{1}{3}$ have repeating bit representations."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#9", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Fixed Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#9", "page_content": "The fixed-point representation, also called the $p.q$ fixed-point representation, extends the idea of binary fractions by adding a sign bit, making the left part of the binary point the same as the two's complement. The right part is the same fractional part. The number of bits for the integer part (including the sign bit) corresponds to $p$, the number of bits for the fractional part corresponds to $q$, with 17.14 being the most popular format. \n$$\nD(P)=-b_p \\cdot 2^p + \\sum_{k=-q}^{p-1}{b_k \\cdot 2^k}\n$$ \n \nThis representation has many pros: it is simple, we can use ordinary integer arithmetic operations and don't need special floating-point hardware, which is why it is commonly used in many low-cost embedded processors. The only con is that we can not represent a wide range of numbers, which we will fix with the next and last representation."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#10", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#10", "page_content": "In 1985 the [IEEE standard 754](https://standards.ieee.org/ieee/754/993/) was released and quickly adopted as the standard for so-called floating-point arithmetic. In 1989, William Kahan, one of the primary architects, even received the Turing Award, which is like the Nobel Prize for computer science. The floating-point representation builds on the ideas of the fixed-point representation and [scientific notation](../../Mathematik/scientificNotation). \nFloating-point representation consists of 3 parts: the sign bit, and, like the scientific notation, an exponent and mantissa. \n \nWe most commonly use the following sizes for the exponent and mantissa: \n- Single precision: 8 bits for the exponent, 23 bits for the mantissa making a total of 32 bits with the sign bit.\n- Double precision: 11 bits for the exponent, 52 bits for the mantissa making a total of 64 bits. It doesn't offer much of a wider range than the single precision, however, it does offer more precision, hence the name. \nIn 2008 the IEEE standard 754 was revised with the addition of the following sizes: \n- Half precision: 5 bits for the exponent, 10 bits for the mantissa making a total of 16 bits.\n- Quad precision: 15 bits for the exponent, 112 bits for the mantissa making a total of 128 bits. \nWith the rise of artificial intelligence and neural networks, smaller representations have gained popularity for quantization. This popularity introduced the following so-called minifloats consisting of 8 bits in total: \n- E4M3: as the name suggests, 4 bits for the exponent and 3 bits for the mantissa.\n- E5M2: 5 bits for the exponent and 2 bits for the mantissa. \nThe brain floating point, which was developed by Google Brain, is also very popular for AI as it has the same range as single precision due to using the same number of bits for the exponent, but with less precision."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#11", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#11", "page_content": "The brain floating point, which was developed by Google Brain, is also very popular for AI as it has the same range as single precision due to using the same number of bits for the exponent, but with less precision. \nHowever, the floating-point representation uses normalized values, just like scientific notation. Meaning the mantissa is normalized to the form of \n$$\n1.000010010...110_2\n$$ \nSo, in reality, we are not actually storing the mantissa but only the fraction part, which is why it is also commonly referred to as the fraction. This leads to two things: we get an extra bit for free since we imply that the first bit is 1, but we can no longer represent the value 0. We will however see later how we can solve the problem of representing 0. \nWe also do not store the exponent using the two's complement. Instead, we use the so-called biased notation for the simple reason of wanting to compare values quickly with each other. To do this we want a form where the exponent with all zeros $0000\\,0000$ is smaller than the exponent with all ones $1111\\,1111$, which wouldn't be the case when using the two's complement. Instead, we use a bias. To calculate the bias we use the number of bits used to represent the exponent, $k$. For single precision $k=8$, so the bias for single precision is $127$, calculated using the formula: \n$$\nbias = 2^{k-1}-1\n$$ \n\nNow that we understand the form of the floating-point representation let us look at an example. We want to store the value $2022$ using single precision floating-point. First, we set the sign bit, in this case $0$. Then we convert the value to a binary fraction. Then we normalize it whilst keeping track of the exponent. Then lastly we store the fraction part and the exponent + the bias. \n$$\n\\begin{align}\n2022 &= 11111100110._2 \\cdot 2^0 & \\text{Convert to binary fraction} \\\\\n&= 1.1111100110_2 \\cdot 2^{10} & \\text{Shift binary point to normalize} \\\\\nM &= 1.1111100110_2 & \\text{Mantissa} \\\\\nFraction &= 1111100110_2 & \\text{Fraction} \\\\"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#12", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#12", "page_content": "$$\n\\begin{align}\n2022 &= 11111100110._2 \\cdot 2^0 & \\text{Convert to binary fraction} \\\\\n&= 1.1111100110_2 \\cdot 2^{10} & \\text{Shift binary point to normalize} \\\\\nM &= 1.1111100110_2 & \\text{Mantissa} \\\\\nFraction &= 1111100110_2 & \\text{Fraction} \\\\\nE &= 10 & \\text{Exponent} \\\\\nExp &= E + bias = 10 + 127 = 1000\\,1001_2 & \\text{Biased Exponent}\n\\end{align}\n$$ \n| Sign | Exponent | Fraction |\n| ---- | --------- | ---------------------------- |\n| 0 | 1000 1001 | 1111 1001 1000 0000 0000 000 |\n \n#### Denormalized values \nAs mentioned above we can't represent the value $0$ using normalized values. For this, we need to use denormalized values, also often called subnormal values. In the case of single precision, we reserve the exponent that consists of only zeros, so it has the biased value $0$ and therefore the exponent $1-bias$, which for single precision is $-126$. If the fraction also consists of all zeros then we have a representation for the value $0$. If it is not zero then we just have evenly distributed values close to 0. \n\n| Value | Sign | Exponent | Fraction |\n| ------------------------------------------------- | ---- | --------- | ---------------------------- |\n| 0 | 0 | 0000 0000 | 0000 0000 0000 0000 0000 000 |\n| -0 | 1 | 0000 0000 | 0000 0000 0000 0000 0000 000 |\n| $0.5 \\cdot 2^{-126} \\approx 5.877 \\cdot 10^{-39}$ | 0 | 0000 0000 | 1000 0000 0000 0000 0000 000 |\n| $0.99999 \\cdot 2^{-126}$ | 0 | 0000 0000 | 1111 1111 1111 1111 1111 111 |\n \n#### Special Numbers"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#13", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#13", "page_content": "| $0.5 \\cdot 2^{-126} \\approx 5.877 \\cdot 10^{-39}$ | 0 | 0000 0000 | 1000 0000 0000 0000 0000 000 |\n| $0.99999 \\cdot 2^{-126}$ | 0 | 0000 0000 | 1111 1111 1111 1111 1111 111 |\n \n#### Special Numbers \nFor some cases we want to be able to store some special values such as $\\infty$ if we do $1.0 / 0.0$ or $NaN$ when doing $\\sqrt{-1}$ or $\\infty - \\infty$. Just like with solving the issue of representing $0$, to represent special values we can reserve an exponent, in the case of single precision this is the exponent consisting of only ones. If the fraction only consists of zeros then it represents the value $\\infty$ otherwise if the fraction is not all zeros it represents $NaN$. \n| Value | Sign | Exponent | Fraction |\n| --------- | ---- | --------- | ---------------------------- |\n| $\\infty$ | 0 | 1111 1111 | 0000 0000 0000 0000 0000 000 |\n| $-\\infty$ | 1 | 1111 1111 | 0000 0000 0000 0000 0000 000 |\n| $NaN$ | 0 | 1111 1111 | 1000 0000 0000 0000 0000 000 |\n| $NaN$ | 1 | 1111 1111 | 1111 1111 1111 1111 1111 111 | \nFor other representations such as the E4M3, E5M2 or bfloat16 the handling of special numbers can be different. 
This comes down to there being fewer bits, and therefore each bit having more meaning, so reserving an entire exponent range just to represent $NaN$ would be a big waste: \n|| E4M3 | E5M2 |\n| ------------------ | ---------------- | ------------------------ |\n| $-\infty / \infty$ | N/A | $S\,11111\,00_2$ |\n| $NaN$ | $S\,1111\,111_2$ | $S\,11111\,{01,10,11}_2$ |\n| $-0/0$ | $S\,0000\,000_2$ | $S\,00000\,00_2$ | \n#### Precision \nAs mentioned at the beginning of the floating-point section, there are in theory infinitely many real numbers, however we can not represent an infinite amount of numbers with a finite number of bits. Below you can see an estimated visualization of what values can actually be represented. \n \nAt a closer look, we can also see how the representations are distributed with the values close to zero being very precise. \n \nThis issue can however cause problems of imprecision if a certain number can not be represented and is rounded to the closest number that can be represented. For example in C we can do the following: \n```c\n#include <stdio.h>\nint main ()\n{\ndouble d;\nd = 1.0 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1;\nprintf(\"d = %.20f\\n\", d); // Not 2.0, outputs 2.00000000000000088818\n}\n``` \n#### Rounding \nThe IEEE standard 754 defines four rounding modes: \n- Round-up\n- Round-down\n- Round-toward-zero, i.e. truncate, which is commonly done when converting from floating point to integer.\n- Round-to-even, the most common but also the most complicated of the four modes. \nI will not go into the details of the first three modes as they are self-explanatory. Let us first look at why we need round-to-even. The reason is actually pretty simple: normal rounding is not very fair. 
\n\n$$\n\\begin{align*}\n& &0.5+1.5+2.5+3.5 &= 8 \\\\\n\\text{Rounded: }& &1+2+3+4 &= 10 \\\\\n\\text{Round-to-even: }& &0 + 2 + 2 + 4 &= 8\n\\end{align*}\n$$\n \n\nThis part is not correct.\n \nWhen working with round-to-even we need to keep track of 3 things: \n- Guard bit: The LSB that is still part of the fraction.\n- Round bit: The first bit that exceeds the fraction.\n- Sticky bit: A bitwise OR of all the remaining bits that exceed the fraction. \nSo if we only have a mantissa of 4 bits, i.e a fraction with 3 bits then it could look like this: \n \nNow we have 3 cases: \n- If $GRS=0xx$ we round down, i.e do nothing since the LSB is already $0$.\n- If $GRS=100$ this is a so-called tie, if the bit before the guard bit is $1$ we round the mantissa up otherwise we round down i.e set the guard bit to $0$"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#15", "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx#15", "page_content": "Now we have 3 cases: \n- If $GRS=0xx$ we round down, i.e do nothing since the LSB is already $0$.\n- If $GRS=100$ this is a so-called tie, if the bit before the guard bit is $1$ we round the mantissa up otherwise we round down i.e set the guard bit to $0$\n- For all other cases $GRS=110$, $GRS=101$ and $GRS=111$ we round up. \n\nAfter rounding, you might have to normalize and round again for example if we have $1.1111\\,1111|11$ with $GRS=111$ and Biased exponent $128$, i.e $2^1$. We have to round up and get $11.0000\\,0000$ therefore we need to increase the exponent by $1$ to normalize again. This also means that after rounding we can produce a over or underflow to infinity.\n \n#### Addition/Subtraction \n#### Multiplication"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#1", "metadata": {"Header 1": "Arithmetic and Logical Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#1", "page_content": "Arithmetic and logical operations are some of the key building blocks for writing any program as almost any functionality boils down to them."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#2", "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Arithmetic Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#2", "page_content": "In RISC-V all arithmetic operations have the same form, two sources (b and c) and one destination (a). Later on, we will learn more forms of operations and also how these operations are encoded to machine code, i.e to binary digits. \n```assembly\nadd x20, x21, x20\n``` \nThis is in aid of the first design principle of RISC-V \n> Simplicity favors regularity. \n\nIf we have the following C code \n```c\n// f in x19, g in x20, h in x21\n// i in x22, j in x23\nf = (g + h) – (i + j);\n``` \nand we compile it we can expect that the following RISC-V code will be assembled. \n```assembly\nadd x5, x20, x21\nadd x6, x22, x23\nsub x19, x5, x6\n``` \nHere we make use of the temporary registers `x5` and `x6`.\n \nWe will see what the immediate instructions are for further down. \n| Instruction | Type | Example | Meaning |\n| -------------------------------- | ---- | ---------------------- | ------------------------------------------- |\n| Add | R | `add rd, rs1, rs2` | `R[rd] = R[rs1] + R[rs2]` |\n| Subtract | R | `sub rd, rs1, rs2` | `R[rd] = R[rs1] – R[rs2]` |\n| Add immediate | I | `addi rd, rs1, imm12` | `R[rd] = R[rs1] + SignExt(imm12)` |\n| Set less than | R | `slt rd, rs1, rs2` | `R[rd] = (R[rs1] < R[rs2])? 1 : 0` |\n| Set less than immediate | I | `slti rd, rs1, imm12` | `R[rd] = (R[rs1] < SignExt(imm12))? 1 : 0` |\n| Set less than unsigned | R | `sltu rd, rs1, rs2` | `R[rd] = (R[rs1] Make the common case fast. \nImmediate operands are faster as they avoid a load instruction. However, due to the way instructions are encoded we can only use constants that use up to 12 bits. 
However, later on, we will see how we can work with larger constants. \n```assembly\naddi x22, x22, 4\n```"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#5", "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Logical Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#5", "page_content": "We also often find ourselves manipulating or working with bits which is what the logical operations are for. They are in most high-level programming languages the same with the most common exception being for the arithmetic or logical shift right operation. The key difference between these 2 is that the arithmetic version fills the left with zeros where as the logical version fills it with the sign bit resulting in a sign-extension to preserve the decimal value. \n| Instruction | Type | Example | Meaning |\n| -------------------------------- | ---- | --------------------- | --------------------------------------- |\n| AND | R | `and rd, rs1, rs2` | `R[rd] = R[rs1] & R[rs2]` |\n| OR | R | `or rd, rs1, rs2` | `R[rd] = R[rs1] | R[rs2]` |\n| XOR | R | `xor rd, rs1, rs2` | `R[rd] = R[rs1] ^ R[rs2]` |\n| AND immediate | I | `andi rd, rs1, imm12` | `R[rd] = R[rs1] & SignExt(imm12)` |\n| OR immediate | I | `ori rd, rs1, imm12` | `R[rd] = R[rs1] | SignExt(imm12)` |\n| XOR immediate | I | `xori rd, rs1, imm12` | `R[rd] = R[rs1] ^ SignExt(imm12)` |\n| Shift left logical | R | `sll rd, rs1, rs2` | `R[rd] = R[rs1] << R[rs2]` |\n| Shift right arithmetic | R | `sra rd, rs1, rs2` | `R[rd] = R[rs1] >> R[rs2] (arithmetic)` |\n| Shift right logical | R | `srl rd, rs1, rs2` | `R[rd] = R[rs1] >> R[rs2] (logical)` |\n| Shift left logical immediate | I | `slli rd, rs1, shamt` | `R[rd] = R[rs1] << shamt` |\n| Shift right logical immediate | I | `srli rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (logical` |\n| Shift right arithmetic immediate | I | `srai rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (arithmetic)` |"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#6", "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Logical Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#6", "page_content": "| Shift right logical immediate | I | `srli rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (logical` |\n| Shift right arithmetic immediate | I | `srai rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (arithmetic)` | \nIf we look at the RISC-V logical operations there isn't anything special apart from there not being a NOT operation. This is because it can be simply implemented by using the XOR operation which sets a bit to 1 if the bits are different and otherwise a 0. To be more precise we XOR with the value that only consists of positive bits to simulate a NOT operation. However, we will come across pseudo instructions where there will be a NOT operation."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#7", "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Operations With Large Constants", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#7", "page_content": "If we want to work with constants larger than 12 bits we need to do use the following instruction: \n```assembly\nlui x19, 0x003D0\n``` \nThis instruction stands for load upper immediate and allows us to load the 20 most significant bits into a registry. The 12 remaining bits will be set to 0 but we can also set these by either adding or using an OR operation. \n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#8", "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Assembly Optimization", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx#8", "page_content": "One of the main goals of a compiler is to optimize the program when writing the assembly code. \n```c\n// x in a0, y in a1\nint logical (int x, int y) {\nint t1 = x ^ y;\nint t2 = t1 >> 17;\nint mask = (1 << 8) – 7;\nint rval = t2 & mask;\nreturn rval;\n}\n``` \n```assembly\nxor a0, a0, a1 # a0 = x ^ y (t1)\nsrai a0, a0, 17 # a0 = t1 >> 17 (t2)\nandi a0, a0, 249 # a0 = t2 & ((1 << 8) – 7)\n``` \nIn the above example, we can see that a few simple optimizations have been made: \n- Because x is only needed once we can use its registry to store the result of the first line instead of having to use a separate temporary registry\n- The calculation of the mask only consists of constants, which means it can be calculated at runtime. This results in the last two statements being combined into one instruction. \n```c\n// x in a0, y in a1, z in a2\nint arith (int x, int y, int z) {\nint t1 = x + y;\nint t2 = z + t1;\nint t3 = x + 4;\nint t4 = y * 48;\nint t5 = t3 + t4;\nint rval = t2 - t5;\nreturn rval;\n}\n``` \n```assembly\nadd a5, a0, a1 # a5 = x + y (t1)\nadd a2, a5, a2 # a2 = t1 + z (t2)\naddi a0, a0, 4 # a0 = x + 4 (t3)\nslli a5, a1, 1 # a5 = y * 2\nadd a1, a5, a1 # a1 = a5 + y\nslli a5, a1, 4 # a5 = a1 * 16 (t4)\nadd a0, a0, a5 # a0 = t3 + t4 (t5)\nsub a0, a2, a0 # a0 = t2 – t5 (rval)\n``` \nIn this example the assembly code is actually longer then the C code. However, it has been optimzed, length of code does not correspond to efficiency. To be more precise the multiplication has been optimized because multiplicaitons are very slow. 
So instead of multiplying, the compiler tries to make use of bit shifts, which are much faster. So `y * 48` becomes `(3y) << 4`. Another example of this would be replacing `7 * x` with `8 * x - x`, which can be translated to `(x << 3) - x`."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#1", "metadata": {"Header 1": "Control Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#1", "page_content": "When programming we often find ourselves using control structures like if and else this creates branches in our program where we either go down one or the other branch. RISC-V offers so-called branch instructions which in most cases take 2 operands and a Label to jump to after checking the condition. Labels are not some magic keywords, they are just an offset off the program counter, PC that is automatically handled by the assembler. \n| Instruction | Type | Example | Meaning |\n| ------------------------------------- | ---- | ---------------------- | ------------------------------------------------------- |\n| Branch equal | SB | `beq rs1, rs2, imm12` | `if (R[rs1] == R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch not equal | SB | `bne rs1, rs2, imm12` | `if (R[rs1] != R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch greater than or equal | SB | `bge rs1, rs2, imm12` | `if (R[rs1] >= R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch greater than or equal unsigned | SB | `bgeu rs1, rs2, imm12` | `if (R[rs1] >=u R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch less than | SB | `blt rs1, rs2, imm12` | `if (R[rs1] < R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch less than unsigned | SB | `bltu rs1, rs2, imm12` | `if (R[rs1] < u R[rs2]) pc = pc + SignExt(imm12 << 1)` | \nIn RISC-V you might notice that there is no greater then or less than or equal. This is because we can emulate these by just switching the operands, however, most CPUs have pseudo instructions to make the assembly code more readable. \n\n```c\n// i in x22, j in x23, f in x19, g in x20, h in x21\nif (i == j)\nf = g + h;\nelse\nf = g – h;\n```"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#2", "metadata": {"Header 1": "Control Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#2", "page_content": "\n```c\n// i in x22, j in x23, f in x19, g in x20, h in x21\nif (i == j)\nf = g + h;\nelse\nf = g – h;\n``` \nIn the code below we can also see a so-called unconditional branch meaning we always jump to the given Label. This unconditional branch makes us of the register `x0` always holding the value 0. \n```assembly\nbne x22, x23, L1\nadd x19, x20, x21\nbeq x0, x0, Exit # unconditional"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#3", "metadata": {"Header 1": "Control Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#3", "page_content": "L1:\nsub x19, x20, x21\nExit:\n```\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#4", "metadata": {"Header 1": "Control Transfer Operations", "Header 2": "Basic Blocks", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#4", "page_content": "A basic block is a small building block for a program. It is a sequence of instructions that has no branch calls except for at the end and no has no branch target apart from at the beginning. A goal of the compiler to make as many big basic blocks as it can as this is better for optimization and reusability. \n\nLet us compare to different assembler outputs for the same code and look at their basic blocks. Our code does the following: \n```c\nint fact_while (int x) {\nint result = 1;\nwhile (x > 1) {\nresult *= x;\nx = x – 1;\n}\nreturn result;\n}\n``` \nIt is common to rewrite loops as goto commands when trying to convert high-level code to assembler code. \n```c\nint fact_while (int x) {\nint result = 1;\nLoop:\nif (x <= 1) goto Exit;\nresult = result * x;\nx = x – 1;\ngoto Loop;\nExit:\nreturn result;\n}\n``` \n```assembly\nfact_while:\naddi a5, a0, 0 # a5 = x (x)\naddi a0, zero, 1 # a0 = 1 (result)\nLoop:\naddi a4, zero, 1 # a4 = 1\nble a5, a4, Exit # if (x <= 1) goto Exit\nmul a0, a0, a5 # result *= x\naddi a5, a5, -1 # x = x – 1\nbeq zero, zero, Loop # goto Loop\nExit:\n``` \nThe assembly code above has 3 small basic blocks but if we convert the C code to this structure we can decrease the amount of basic blocks and increase their size. 
\n```c\nint fact_while2 (int x) {\nint result = 1;\nif (x <= 1) goto Exit;\nLoop:\nresult = result * x;\nx = x – 1;\nif (x != 1) goto Loop;\nExit:\nreturn result;\n}\n``` \n```assembly\nfact_while2:\naddi a5, a0, 0 # a5 = x (x)\naddi a4, zero, 1 # a4 = 1\naddi a0, zero, 1 # a0 = 1 (result)\nble a5, a4, Exit # if (x <= 1) goto Exit\nLoop:\nmul a0, a0, a5 # result *= x\naddi a5, a5, -1 # x = x – 1\nbne a5, a4, Loop # if (x != 1) goto Loop\nExit:\n```\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#5", "metadata": {"Header 1": "Control Transfer Operations", "Header 2": "Target Adressing", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx#5", "page_content": "When jumping using the branch instructions most jumps are not very far. As mentioned before the label is like an immediate offset meaning it can be up to 12 bits long. If we want to jump further we can use one of the commands below. The `jal` instruction stands for jump and link, we jump using the passed offset which can now be 20 bits long. We also store the current PC i.e the return address into the corresponding `rd` register. If we want to jump even further then we can load a large immediate into a temporary register using the `lui` instruction and then add the remaining 12 bits and jump at the same time using the `jalr` instructions which also lets us read the offset from a register. \n| Instruction | Type | Example | Meaning |\n| ------------------------------------- | ---- | ---------------------- | ------------------------------------------------------- |\n| Jump and link | UJ | `jal rd, imm20` | `R[rd] = pc + 4; pc = pc + SignExt(imm20 << 1)` |\n| Jump and link register | I | `jalr rd, imm12(rs1)` | `R[rd] = pc + 4; pc = (R[rs1] + SignExt(imm12)) & (~1)` | \n\nWe can also use the `jal` instruction as an unconditional branch by using the zero register as the return address, which is the same as discarding it: \n```assembly\njal x0, Label\n```\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#1", "metadata": {"Header 1": "Data Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#1", "page_content": "Just using registries to store data is not enough which is why we also have main memory and secondary memory. Main memory is especially useful when working with composite data such as data structures or dynamic data. \nAs mentioned previously we can not directly work on data that is stored in memory, the CPU can only work on data that is in a registry. This leads us to load and store data between the registries and the main memory. \nEach byte in memory has an address. For composite data, RISC-V uses the little endian byte ordering meaning that the LSB byte is at the smallest address. \nRISC-V defines a word as data that consists of 32 bits this corresponds to the size of the registry and is the most common size to read and write to and from memory. However, we can also only read a byte which is useful since ASCII only uses a byte. RISC-V also supports reading a so-called halfword which corresponds to 16 bits which is useful when working with Unicode characters. \nWe do however need to keep in mind that in memory we only store the value, no context. So if we want a word to be handled like an unsigned integer we also need to specify that otherwise, it will treat it by default as a signed integer. 
\n| Instruction | Type | Example | Meaning |\n| ---------------------- | ---- | -------------------- | ------------------------------------------------ |\n| Load word | I | `lw rd, imm12(rs1)` | `R[rd] = Mem4[R[rs1] + SignExt(imm12)]` |\n| Load halfword | I | `lh rd, imm12(rs1)` | `R[rd] = SignExt(Mem2[R[rs1] + SignExt(imm12)])` |\n| Load byte | I | `lb rd, imm12(rs1)` | `R[rd] = SignExt(Mem1[R[rs1] + SignExt(imm12)])` |\n| Load word unsigned | I | `lwu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem4[R[rs1] + SignExt(imm12)])` |\n| Load halfword unsigned | I | `lhu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem2[R[rs1] + SignExt(imm12)])` |"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#2", "metadata": {"Header 1": "Data Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#2", "page_content": "| Load word unsigned | I | `lwu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem4[R[rs1] + SignExt(imm12)])` |\n| Load halfword unsigned | I | `lhu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem2[R[rs1] + SignExt(imm12)])` |\n| Load byte unsigned | I | `lbu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem1[R[rs1] + SignExt(imm12)])` |\n| Store word | S | `sw rs2, imm12(rs1)` | `Mem4[R[rs1] + SignExt(imm12)] = R[rs2]` |\n| Store halfword | S | `sh rs2, imm12(rs1)` | `Mem2[R[rs1] + SignExt(imm12)] = R[rs2](15:0)` |\n| Store byte | S | `sb rs2, imm12(rs1)` | `Mem1[R[rs1] + SignExt(imm12)] = R[rs2](7:0)` |"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#3", "metadata": {"Header 1": "Data Transfer Operations", "Header 2": "Loading With Pointers", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#3", "page_content": "Pointers in C are nothing else but memory addresses which means we can also load data from and to them. The most simple use of pointers is to swap to values: \n```c\n// x in a0, y in a1\nvoid swap(int *x, int *y)\n{\nint temp_x = *x;\nint temp_y = *y;\n*x = temp_y;\n*y = temp_x;\n}\n``` \nAnd as we can see we can use addresses stored in registries to load and write data: \n```assembly\nlw a4, 0(a0)\nlw a5, 0(a1)\nsw a5, 0(a0)\nsw a4, 0(a1)\n```"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#4", "metadata": {"Header 1": "Data Transfer Operations", "Header 2": "Loading Sequential Data", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx#4", "page_content": "When reading sequential data we do need to keep in mind that each address only corresponds to a byte. This leads us to make \"jumps\" of size 4. \n\n```c\n// h in x21, base address of A in x22\nA[9] = h + A[8]\n``` \n```assembly\nlw x9, 32(x22)\nadd x9, x21, x9\nsw x9, 46(x22)\n```\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx#1", "metadata": {"Header 1": "Procedure Calls", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx#1", "page_content": "A procedure or function is one of the key building blocks for programmers as it allows them to create understandable and reusable code. It also a way to add abstraction and simplify a program. In simple a procedure works as follows: \n1. Put the parameters in a place the procedure (callee) can access them.\n2. Transfer control to the procedure.\n3. Acquire the storage resources needed for the procedure.\n4. Perform the task.\n5. Put the result of the task in a place the caller can access them.\n6. Return control to the caller. \nFor the first and fifth point, we have the registers `x10-x17`. So that we know where to return to in step 6 we store the caller address in `x1`, this would be done when transferring control to the procedure with the `jal x1, Label` instruction."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx#2", "metadata": {"Header 1": "Procedure Calls", "Header 2": "Using More Registers", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx#2", "page_content": "But what if the 8 registers for the arguments are not enough to complete our task. We can use other registers as long as we clean up after ourselves, meaning we can spill registers to memory and then restore the registers before returning control. This leads us to the idea of the stack. In RISC-V we keep track of a stack pointer in `x2` and push and pop data to the stack. Important here is however that the stack grows from high to low addresses meaning when updating the stack pointer we need to subtract. \nFor this we also remember that the registers `x5-x7` and `x28-x31` are temporary registers and do not need to be restored before returning control but the registers `x8-x9` and `x18-x27` are saved registers and do need to be restored. \n\nIn the below example we could just use the temporary registers to store the temporary values but instead, we will spill some registers to the stack to demonstrate how this could be done. \n```c\n// g in x10, h in x11, i in x12, j in x13\nint leaf( int g, int h, int i, int j) {\nint f;\nf = (g + h) – (i + j);\nreturn f;\n}\n``` \n```assembly\nleaf:\naddi sp, sp, -12 # make space on stack for 12 bytes 3x 32 bits\nsw x5,8(sp) # save x5\nsw x6,4(sp) # save x6\nsw x7,0(sp) # save x7\nadd x5,x10,x11 # x5 <- g + h\nadd x6,x12,x13 # x6 <- i + j\nsub x7,x5,x6 # x7 <- x5 - x6\naddi x10,x7,0 # write result to x10 <- x7\nlw x7,0(sp) # restore x7\nlw x6,4(sp) # restore x6\nlw x5,8(sp) # restore x5\naddi sp,sp,12 # adjust stack\njalr x0,0(x1) # return to caller\n```\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx#3", "metadata": {"Header 1": "Procedure Calls", "Header 2": "Using More Registers", "Header 3": "Nested Procedures", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx#3", "page_content": "Procedures that do not call other procedures are called leaf procedures. But these are very rarely seen in programs, much more often we see nested procedures or even recursive procedures which need a lot of care when working with registers. For example, imagine the Procedure $A$ is called and the argument 3 is stored in `x10` and return address in `x1`. If $A$ then wants to call the procedure $B$ the argument in `x10` and return address in `x1` must be overwritten. So to prevent these collisions we must carefully push data to the stack and retrieve it again at a later time. \nTo aid this tricky task of keeping track of the local data of a procedure some RISC-V compilers use a frame pointer `fp` which is stored in the register `x8`. As the stack pointer can always change the frame pointer offers a stable base register for local memory references. \n \n\n```c\nint fact(int n)\n{\nif (n < 1)\nreturn 1;\nelse\nreturn n * fact(n-1);\n}\n``` \n```assembly\nfact:\naddi sp, sp, -8 # make space for 8 bytes\nsw x1, 4(sp) # save return address\nsw x10, 0(sp) # save n\naddi x11, x10, -1 # x11 <- n - 1\nbge x11, zero, L1 # if (x11 >= 0), goto L1\naddi x10, zero, 1 # x10 <- 1 (retval)\naddi sp, sp, 8 # adjust stack\njalr zero, 0(x1) # return\nL1:\naddi x10, x10, -1 # x10 <- n - 1\njal x1, fact # call fact(n-1)\naddi t1, x10, 0 # t1 <- fact(n-1)\nlw x10, 0(sp) # restore n\nlw x1, 4(sp) # restore return address\naddi sp, sp, 8 # adjust stack pointer\nmul x10, x10, t1 # x10 <- n * t1 (retval)\njalr zero, 0(x1) # return\n```\n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/pseudoInstructions.mdx#1", "metadata": {"Header 1": "Pseudo Instructions", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/pseudoInstructions.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/pseudoInstructions.mdx#1", "page_content": "As I have already mentioned multiple times, some RISC-V implementations also offer pseudo instructions which are like aliases for other instructions but make the assembly code easier to read and understand. \n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#1", "metadata": {"Header 1": "What is RISC-V?", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#1", "page_content": "RISC-V is an open standard instruction set architecture that has been developed at the University of California, Berkeley since 1981 and is based on the established RISC principles which we will see when diving deeper."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#2", "metadata": {"Header 1": "What is RISC-V?", "Header 2": "CISC vs RISC", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#2", "page_content": "Up until about 1986 most chip manufacturers were using the CISC (Complex Instruction Set Computers) Architecture, the most common example of this being the Intel x86 ISA which is widely used nowadays. However, they realized that it makes building the chips more complicated and slows down potential improvements. This brought on the switch to RISC (Reduced Instruction Set Computers) which focuses on having a small number of simple instructions and then letting the compilers resolve complexity. Some of the most common examples of RISC are MIPS, AMD ARM and the open-source RISC-V which is what we will be looking at. \nHowever, you might have realized that most computers that you interact with use x86, doesn't that mean that you aren't getting the best performance that you could? This is actually not true, almost all chips nowadays use the RISC architecture, including x86 chips. But I just said that x86 uses CISC? This is true for the early x86 chips, the x86 chips nowadays are hybrid chips. They support CISC instructions for backward compatibility as a lot of devices were already using CISC, however, inside the chips they convert the CISC instructions to RISC instructions and execute them, which makes them have a \"RISC\" core."}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#3", "metadata": {"Header 1": "What is RISC-V?", "Header 2": "Extensions", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#3", "page_content": "RISC-V aims to be as lightweight as possible which is why it allows for extensions to be added for certain functionalities. This allows chip manufacturers to only add what they need and not have instructions that they never intend to use or support. We will mainly be focusing on the `RV32IG` variant which is equivalent to `RV32IMAFD`. \n"}}
+{"id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#4", "metadata": {"Header 1": "What is RISC-V?", "Header 2": "Register Layout", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx", "id": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx#4", "page_content": "RISC-V has 32 (or 16 in the embedded variant) integer registers and with the floating-point extension another separate 32 floating-point registers. Since our focus is on the 32-bit variation each register can store 32 bits. These registers are essential to the CPU as it can only work with data that is in a register it can not work on data in main memory. So if we want to manipulate data that is in the main memory we need to first transfer the data from the main memory to a register. \nCertain registers have restrictions or should be used in a certain way. Most notable is that the first register will always store the value 0. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#1", "metadata": {"Header 1": "Actor Model", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#1", "page_content": "The actor model is a mechanism to deal with concurrent computation. An actor (an autonomous concurrent object) is the primitive unit of computation. Actors are executed asynchronously of each other and have the following: \n- Private state; there should be no shared mutable state between actors!\n- An inbox that stores the received messages in FIFO order.\n- Behavior that is executed asynchronously when a message is received from other actors. \nThe key concept is that actors are completely isolated from each other and never share memory. The only way they can share state is by exchanging messages with each other. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#2", "metadata": {"Header 1": "Actor Model", "Header 2": "Scala Akka", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#2", "page_content": "Actors in Scala are implemented with the akka library. Akka is very popular because it is very simple and self-explanatory but also because it has very high performance. It can send up to 50 million messages per second and 2.5 million actors take up 1 GB on the heap."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#3", "metadata": {"Header 1": "Actor Model", "Header 2": "Scala Akka", "Header 3": "Creating an Actor", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#3", "page_content": "To work with akka you need to create an actor system. This is the network/container for all the actors. To then create an actor you need to extend the Actor class and implement the receive method, which is the method that will be called when a message is received. To then create a reference to the actor you need to instantiate it as part of the actor system with the `actorOf` method. The returned reference is immutable. You can only create actors this way; if you try to instantiate an actor with `new` you will get an `akka.actor.ActorInitializationException`. Finally, you can send messages to the actor by using the `!` operator. \n```scala\nimport scala.language.postfixOps // required for `a ! msg` postfix syntax\nimport akka.actor.{ActorSystem, Actor, ActorRef, Props}\n\nval as = ActorSystem(\"as\") // Actor infrastructure\nclass PrintActor extends Actor { // Actor definition\nvar nthRequest = 0 // Mutable state\ndef receive = { case msg => // Behavior\nnthRequest += 1\nprintln(s\"$nthRequest:$msg\")\n}\n}\nval printActor: ActorRef = as.actorOf(Props[PrintActor]) // Creates and starts actor\n\nprintActor ! \"Hello\"\nprintActor ! \"Bye\"\n``` \nIn most cases the actor is created with the default constructor, in which case you can use `as.actorOf(Props[PrintActor])`. 
However, if you want to pass arguments to the constructor then you can do something like this: \n```scala\nclass HelloActor(myName: String) extends Actor {\ndef receive = {\ncase \"hello\" => println(s\"hello from $myName\")\ncase _ => println(s\"'huh?', said $myName\")\n}\n}\n\nobject Main extends App {\nval as = ActorSystem(\"HelloSystem\")\nval helloActor = as.actorOf(Props(new HelloActor(\"Fred\")), name = \"helloActor\")\n}\n``` \nYou can also create an actor as an anonymous subclass: \n```scala\nval print: ActorRef = as.actorOf(Props(\nnew Actor {\ndef receive = { case msg => println(msg) }\n}\n))\n``` \nAkka also supports hierarchies between actors for example you can have child actors for specific functionality: \n```scala\nval as = ActorSystem(\"as\")"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#4", "metadata": {"Header 1": "Actor Model", "Header 2": "Scala Akka", "Header 3": "Creating an Actor", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#4", "page_content": "class ChildActor() extends Actor {\ndef receive = {\ncase msg => println(\"I'm \" + self + \" : \" + msg) // self is like this in actor\n}\n}\n\nclass ParentActor extends Actor {\nval child = context.actorOf(Props[ChildActor], \"child\")\ndef receive = {\ncase msg: String =>\nchild ! \"Greets from dad\"\nprintln(msg)\n}\n}\n\nval p = as.actorOf(Props[ParentActor], \"parent\")\np ! \"Hi Kid\"\np ! \"Bye Kid\"\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#5", "metadata": {"Header 1": "Actor Model", "Header 2": "Scala Akka", "Header 3": "Sending Messages", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#5", "page_content": "Messages are sent asynchronously with the tell operator `!`. Messages are stored in the mailbox of the receiver and can be anything (type Any). Messages are sent with an at-most-once delivery guarantee, i.e. no guaranteed delivery (send-and-pray); however, if the actor system is local then delivery is as reliable as calling a method."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#6", "metadata": {"Header 1": "Actor Model", "Header 2": "Scala Akka", "Header 3": "Receiving Messages", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#6", "page_content": "The `receive` method specifies the initial behavior of an actor when it receives a message. The function uses pattern matching and is defined as `def receive: PartialFunction[Any,Unit]` meaning it is only defined for certain arguments. You can check if a function is defined for a given argument using the `isDefinedAt` method. \n```scala\nval pf: PartialFunction[Any,Unit] = {\ncase i: Int if i > 42 => println(\"huge\")\ncase s: String => println(s.reverse)\n}\npf.isDefinedAt(42) // false\npf.isDefinedAt(43) // true\n``` \nIf there is no match a `MatchError` occurs and the message is published to the actor system's EventStream, which you can imagine as a dead letterbox. You can however add listeners to this letterbox and react to messages that end up there. \nTypically case classes (similar to Java records) are used as messages because they describe the vocabulary an actor understands (its API) and they are convenient for match expressions. You can also refine cases with so-called guards as seen below. \n```scala\ncase class PrintMsg(msg: String)\ncase class ShoutMsg(msg: String)\nclass PrintActor extends Actor {\ndef receive = {\ncase PrintMsg(m) if m.contains(\"@\") => println(\"mail: \" + m)\ncase PrintMsg(m) => println(\"text: \" + m)\ncase ShoutMsg(m) => println(\"RECEIVED: \" + m.toUpperCase)\n}\n}\n``` \nMessage processing is scheduled on a thread pool, which means that not every message of an actor is necessarily processed by the same thread. 
This also means we need some new guarantees: \n- The send of a message happens-before the receive of that message\n- Processing of one message happens-before processing the next message by the same actor"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#7", "metadata": {"Header 1": "Actor Model", "Header 2": "Scala Akka", "Header 3": "Advanced Messaging", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#7", "page_content": "You should use `self` (of type ActorRef) to refer to the current actor and to safely pass it around. You can also use `sender` to refer to the actor that sent a message. \n```scala\nval as = ActorSystem(\"as\")\n\ncase class Msg(msg: String, sender: ActorRef)\n\nclass EchoActor extends Actor {\ndef receive = { case Msg(msg,client) => client ! msg}\n}\nval echoActor = as.actorOf(Props[EchoActor])\n\nclass Sender extends Actor {\nechoActor ! Msg(\"Hello\", self)\ndef receive = { case t => println(t) }\n}\n``` \nThe above example could be simplified to: \n```scala\nval as = ActorSystem(\"as\")\n\nclass EchoActor extends Actor {\ndef receive = { case msg => sender ! msg }\n}\nval echoActor = as.actorOf(Props[EchoActor])\n\nclass Sender extends Actor {\nechoActor ! \"Hello\"\ndef receive = { case t => println(t) }\n}\n``` \nUsing `context.setReceiveTimeout` you can set a timeout for inactivity (when no messages are received). When the timeout is exceeded a `ReceiveTimeout` message is triggered. \n```scala\nclass TimeOutActor extends Actor {\ncontext.setReceiveTimeout(3.second)\ndef receive = {\ncase \"Tick\" => println(\"Tick\")\ncase ReceiveTimeout => println(\"TIMEOUT\")\n}\n}\n``` \n#### Ask Pattern \nAkka also supports Futures. So you can send a message and receive a future containing the answer of the actor. This is done using the ask operator `?`. \n```scala\nclass EchoActor extends Actor {\ndef receive = { case msg => sender ! msg }\n}\nval as = ActorSystem(\"as\")\nval echoActor = as.actorOf(Props[EchoActor])\nimplicit val timeout = Timeout(3.seconds) // consumed by '?'\nval futResult: Future[Any] = (echoActor ? 
\"Hello\") // completed with AskTimeoutException in case of timeout\n// OR\nval timeout = Timeout(3 seconds)\nval futResultString: Future[String] = (echoActor ? (\"Hello\")(timeout)).mapTo[String] // cast to specific type\n\nimport as.dispatcher // ExecutionContext required by Future#map, Future#onComplete etc.\nfutResultString.map(s => s.toUpperCase)\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#8", "metadata": {"Header 1": "Actor Model", "Header 2": "Finite State Machines", "path": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/actorModel.mdx#8", "page_content": "The actor model is ideal for representing finite state machines, as the states are the actors and the events can be represented with messages. For example, we can model a simple light switch: \n```mermaid\nstateDiagram-v2\nOn --> Off\nOff --> On\n``` \n```scala\ncase object On\ncase object Off\nclass Switch extends Actor {\nvar on = false\ndef receive = {\ncase On if !on => println(\"turned on\"); on = true\ncase Off if on => println(\"turned off\"); on = false\ncase _ => println(\"ignore\")\n}\n}\n``` \nAkka also offers so-called hot switching which enables us to swap the behavior of an actor at runtime. \n```scala\nclass Switch extends Actor {\nval offBehavior: PartialFunction[Any,Unit] = {\ncase On => println(\"turned on\"); context.become(onBehavior)\ncase _ => println(\"ignore\")\n}\nval onBehavior: PartialFunction[Any,Unit] = {\ncase Off => println(\"turned off\"); context.become(offBehavior)\ncase _ => println(\"ignore\")\n}\ndef receive = offBehavior // initial behavior\n}\n``` \nThere is also the last variation of this example where the actor remembers his previous behavior. So from the initial off behavior, the new on behavior is placed on top and then removed again so it can return to the initial off behavior \n```scala\nclass Switch extends Actor {\nval offBehavior: PartialFunction[Any,Unit] = {\ncase On => println(\"turned on\"); context.become(onBehavior, false)\ncase _ => println(\"ignore\")\n}\nval onBehavior: PartialFunction[Any,Unit] = {\ncase Off => println(\"turned off\"); context.unbecome()\ncase _ => println(\"ignore\")\n}\ndef receive = offBehavior\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx#1", "metadata": {"Header 1": "Condition Variables", "path": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx#1", "page_content": "Condition variables are another mechanism for synchronizing a program. Condition variables allow threads to enter the waiting state (stop running) until they are signaled/notified by another thread that some condition may have been fulfilled, and they can take over. The most common example used to illustrate this is a carpark. When the carpark is full you have to wait until a car drives out, and it is no longer full. Once this happens you want to be notified that the carpark is no longer full, so you can enter the carpark. \n```java\npublic class CarPark {\nprivate int spaces;\npublic CarPark(int spaces) { this.spaces = spaces; }\npublic synchronized void enter() {\nwhile(spaces == 0) {\ntry { this.wait(); } // waits and releases lock\ncatch (InterruptedException e) { }\n}\nspaces--;\n}\npublic synchronized void exit() {\nspaces++;\nthis.notifyAll(); // wakes up all threads for race to get the lock\n}\n}\n``` \nImportant is that the wait and notify/notifyAll functions are called on the lock object. Because every object can be a lock object the functions are implemented in the `Object` class. Another important thing to note is that when the wait function is called the lock is released so that other threads can still do work. When the thread acquires the lock again it continues from where it was waiting. \n\nMake sure to use a while loop, because of interrupts or spurious wakeups.\n"}}
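The carpark monitor can be exercised with two threads; the sketch below is an illustrative driver, not part of the original example (the class name `CarParkDemo` and the `freeSpaces()` getter are additions for observability):

```java
// Same monitor as in the text, plus a small getter added for the demo.
class CarPark {
    private int spaces;

    public CarPark(int spaces) { this.spaces = spaces; }

    public synchronized void enter() {
        while (spaces == 0) {
            try { this.wait(); } // waits and releases the lock
            catch (InterruptedException e) { }
        }
        spaces--;
    }

    public synchronized void exit() {
        spaces++;
        this.notifyAll(); // wakes up all waiting threads, they re-check the condition
    }

    public synchronized int freeSpaces() { return spaces; } // added for observability
}

public class CarParkDemo {
    public static void main(String[] args) throws InterruptedException {
        CarPark carPark = new CarPark(1); // a single space
        carPark.enter(); // main thread takes the space

        Thread other = new Thread(() -> {
            carPark.enter(); // blocks until main calls exit()
            System.out.println("second car entered");
        });
        other.start();

        Thread.sleep(100); // give the other thread time to block in wait()
        carPark.exit(); // frees the space and notifies the waiter
        other.join();
    }
}
```

The second thread only proceeds once `exit()` has incremented `spaces` and the notified waiter re-evaluates the `while` condition.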
+{"id": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx#2", "metadata": {"Header 1": "Condition Variables", "Header 2": "Notify vs NotifyAll", "path": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx#2", "page_content": "The `notify()` function wakes up a single, arbitrarily selected waiting thread, which might still have to compete for the lock. If there are no threads waiting then the notify function is just like an empty statement. \nThe `notifyAll()` function wakes up all the waiting threads which then must compete for the lock. \nThere are two forms of waiters (waiting threads): \n- **Uniform waiters**: All waiters are equal (wait for the same condition)\n- **One-in, one-out**: A notification on the condition variable enables at most one thread to proceed \nWhen you are working with uniform waiters `notify()` is fine; otherwise it is safer, but less efficient, to use `notifyAll()`."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx#3", "metadata": {"Header 1": "Condition Variables", "Header 2": "BlockingQueue", "path": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/conditionVariables.mdx#3", "page_content": "A blocking queue is a queue that blocks when you try to dequeue from it and the queue is empty, or if you try to enqueue items to it and the queue is already full. \n \nThere are a few ways to implement a thread-safe blocking queue. You can either use three locks, one for each condition and one for the synchronization: \n```java\npublic class Queue {\nprivate final static int SIZE = 10;\nprivate Object[] buf = new Object[SIZE];\nprivate int tail = 0, head = 0;\n\nprivate Object notEmpty = new Object();\nprivate Object notFull = new Object();\n\npublic synchronized Object dequeue() {\nwhile (tail == head) { // while empty\nsynchronized (notEmpty) {\ntry { notEmpty.wait(); } catch (Exception e) {}\n}\n}\nsynchronized (notFull) { notFull.notify(); }\nObject e = buf[head]; head = (head + 1) % SIZE;\nreturn e;\n}\npublic synchronized void enqueue(Object c) {\nwhile ((tail + 1) % SIZE == head) {\nsynchronized (notFull) {\ntry { notFull.wait(); } catch (Exception e) {}\n}\n}\nsynchronized (notEmpty) { notEmpty.notify(); }\nbuf[tail] = c;\ntail = (tail + 1) % SIZE;\n}\n}\n``` \nOr when working with the Lock interface we can add conditions to the lock: \n```java\npublic class Queue {\nprivate final static int SIZE = 10;\nprivate final Object[] buf = new Object[SIZE];\nprivate int tail = 0, head = 0;\n\nprivate final Lock lock = new ReentrantLock();\nprivate final Condition notEmpty = lock.newCondition();\nprivate final Condition notFull = lock.newCondition();\n\npublic Object dequeue() {\nlock.lock();\ntry {\nwhile (tail == head) { // while empty\ntry { notEmpty.await(); } catch (Exception e) {}\n}\nObject e = buf[head]; head = (head + 1) % 
SIZE;\nnotFull.signal(); return e;\n} finally { lock.unlock(); }\n}\npublic void enqueue(Object c) {\nlock.lock();\ntry {\nwhile ((tail + 1) % SIZE == head) {\ntry { notFull.await(); } catch (Exception e) {}\n}\nbuf[tail] = c; tail = (tail + 1) % SIZE;\nnotEmpty.signal();\n} finally {\nlock.unlock();\n}\n}\n}\n```"}}
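For comparison, the JDK already ships a ready-made implementation of this idea: `java.util.concurrent.ArrayBlockingQueue` blocks on `take()` when empty and on `put()` when full. A short, self-contained usage sketch (class name invented for illustration):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedBufferDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(2); // capacity 2

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) { }
        });
        producer.start();

        for (int i = 0; i < 5; i++) {
            System.out.println(queue.take()); // blocks while the queue is empty
        }
        producer.join();
    }
}
```

With a single producer the consumer prints 0 through 4 in FIFO order, even though the buffer only ever holds two elements.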
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#1", "metadata": {"Header 1": "The Executor Framework", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#1", "page_content": "The Java executor framework is used to run and manage Runnable objects, so-called tasks. It does this using so-called workers or worker threads which are most often managed as part of a ThreadPool. Depending on the configuration of the pool, instead of creating new threads every time, the pool will try to reuse already created threads. Any excess tasks that the threads in the pool can't handle at the moment are held in some form of data structure like a BlockingQueue. Once one of the threads has finished its task and becomes free, it picks up the next task from the queue. \n \nThe Executor interface provides a single function `void execute(Runnable task)` which executes the given task; depending on the implementation this is done using a thread pool, a single thread, etc."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#2", "metadata": {"Header 1": "The Executor Framework", "Header 2": "Custom Executors", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#2", "page_content": "This is just a custom executor that uses a thread pool. \n```java\nclass MyThreadPoolExecutor implements Executor {\nprivate final BlockingQueue queue = new LinkedBlockingQueue();\n\npublic void execute(Runnable r) { queue.offer(r); }\n\npublic MyThreadPoolExecutor(int nrThreads) {\nfor (int i = 0; i < nrThreads; i++) { activate(); }\n}\n\nprivate void activate() {\nnew Thread(() -> {\ntry {\nwhile (true) { queue.take().run(); }\n} catch (InterruptedException e) { /* die */ }\n}).start();\n}\n}\n``` \nYou can also create an executor that just executes the given task on the current thread. \n```java\nclass DirectExecutor implements Executor {\npublic void execute(Runnable r) { r.run(); }\n}\n``` \nOr you can create an executor that creates a new thread for each task. \n```java\nclass ThreadPerTaskExecutor implements Executor {\npublic void execute(Runnable r) {\nnew Thread(r).start();\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#3", "metadata": {"Header 1": "The Executor Framework", "Header 2": "Builtin Executors", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#3", "page_content": "The executor framework has some built-in executors that you can access using the factory methods in the `Executors` class. All the factories return instances of the `ExecutorService` interface which extends the `Executor` interface and adds some life-cycle management methods. \n```java\ninterface ExecutorService extends Executor {\nvoid shutdown(); // graceful: finish all pending tasks, don't accept new ones\nList<Runnable> shutdownNow(); // all running tasks are interrupted, returns a list of the tasks that were awaiting execution\nboolean isShutdown();\nboolean isTerminated();\nboolean awaitTermination(long timeout, TimeUnit unit) throws InterruptedException; // blocks until all tasks have completed execution after a shutdown request\n}\n``` \n- `Executors.newFixedThreadPool(int nThreads)`: Creates a thread pool that reuses a fixed number of threads operating off a shared unbounded queue. Threads that die due to an exception are replaced.\n- `Executors.newCachedThreadPool()`: Creates a thread pool that creates new threads as needed, but will reuse previously constructed threads when they are available.\n- `Executors.newSingleThreadExecutor()`: Creates an Executor that uses a single worker thread operating off an unbounded queue. The worker thread is replaced if it dies due to an exception.\n- `Executors.newScheduledThreadPool(int corePoolSize)`: Creates a thread pool that can schedule commands to run after a given delay, or to execute periodically."}}
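A typical lifecycle with one of these factories might look as follows; this is a minimal sketch (class name and the counter task are invented for illustration):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolLifecycleDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        AtomicInteger completed = new AtomicInteger();

        for (int i = 0; i < 10; i++) {
            pool.execute(completed::incrementAndGet); // tasks queue up behind the 4 workers
        }

        pool.shutdown(); // finish pending tasks, accept no new ones
        pool.awaitTermination(5, TimeUnit.SECONDS); // block until everything is done
        System.out.println(completed.get()); // 10
    }
}
```

Calling `shutdown()` before `awaitTermination` is the usual pattern: without the shutdown request, `awaitTermination` would simply time out because the pool never terminates on its own.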
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#4", "metadata": {"Header 1": "The Executor Framework", "Header 2": "Callable and Future", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#4", "page_content": "Because the Runnable interface does not allow for exceptions or results we need to use something different if we wish to have this functionality. The executor framework has a few tools for this. We have the `Callable` interface which is our alternative to the `Runnable` interface and then the `Future` interface which is similar to a promise in JavaScript and represents a future result of a task. \n```java\ninterface Callable<V> {\nV call() throws Exception;\n}\n``` \n```java\ninterface Future<V> {\nboolean cancel(boolean mayInterruptIfRunning);\nboolean isCancelled();\nboolean isDone();\nV get() throws InterruptedException, ExecutionException, CancellationException;\nV get(long timeout, TimeUnit unit) throws InterruptedException, ExecutionException, CancellationException, TimeoutException;\n}\n``` \nInstead of then using the execute function from the `Executor` interface, we have a few additional functions in the `ExecutorService` interface along with the life-cycle methods. \n```java\ninterface ExecutorService extends Executor {\n// ...lifecycle methods\n<T> Future<T> submit(Callable<T> task); // the key function\nFuture<?> submit(Runnable task);\n<T> Future<T> submit(Runnable task, T result);\n// takes a list of tasks and returns a list of the matching results\n<T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks) throws InterruptedException;\n// Executes the given tasks, returning the result of one that has completed successfully if any do.\n<T> T invokeAny(Collection<? extends Callable<T>> tasks) throws InterruptedException, ExecutionException;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#5", "metadata": {"Header 1": "The Executor Framework", "Header 2": "FactorialCalculator example", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#5", "page_content": "In this example, each task returns the factorial of a given number. \n```java\npublic class Main {\npublic static void main(String[] args) throws Exception {\nExecutorService executor = Executors.newFixedThreadPool(2);\n\nList<Future<Long>> resultList = new ArrayList<>();\n\nfor (long i = 1; i <= 20; i++) {\nFuture<Long> result = executor.submit(new FactorialCalculator(i));\nresultList.add(result);\n}\n\nexecutor.shutdown();\nexecutor.awaitTermination(10, TimeUnit.SECONDS);\n\nfor (int i = 0; i < resultList.size(); i++) {\nFuture<Long> result = resultList.get(i);\nLong number = result.get(); // waits for the next result\nSystem.out.println(i + \"::\\t\" + number);\n}\n}\n\nprivate static class FactorialCalculator implements Callable<Long> {\nprivate final Long number;\n\npublic FactorialCalculator(Long number) {\nthis.number = number;\n}\n\n@Override\npublic Long call() throws Exception {\nlong result = 1;\nif (number == 0 || number == 1) {\nresult = 1;\n} else {\nfor (int i = 2; i <= number; i++) {\nresult *= i;\n}\n}\nreturn result;\n}\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#6", "metadata": {"Header 1": "The Executor Framework", "Header 2": "Fork-Join", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#6", "page_content": "If we try and implement a divide and conquer algorithm like the merge-sort with executors we run into some issues, especially because we create a lot of threads that are waiting and not doing anything. \n```java\npublic class MergeSortTask implements Runnable {\npublic final int[] elems, temp;\nprivate final int start, end;\n\nprivate final ExecutorService ex;\n\npublic MergeSortTask(int[] elems, ExecutorService ex) {\nthis.elems = elems;\nthis.start = 0;\nthis.end = elems.length;\nthis.temp = new int[end];\nthis.ex = ex;\n}\n\npublic MergeSortTask(int[] elems, int[] temp, int start, int end, ExecutorService es) {\nthis.elems = elems;\nthis.temp = temp;\nthis.start = start;\nthis.end = end;\nthis.ex = es;\n}\n\n@Override\npublic void run() {\nif (end - start <= 1) {\nreturn;\n} else {\nint mid = (start + end) / 2;\n\nMergeSortTask left = new MergeSortTask(elems, temp, start, mid, ex);\nMergeSortTask right = new MergeSortTask(elems, temp, mid, end, ex);\n\nFuture<?> lf = ex.submit(left);\nFuture<?> rf = ex.submit(right);\ntry {\n//print(\"Waiting for subtasks\");\nlf.get();\nrf.get();\n//print(\"Subtasks are ready\");\n} catch (Exception e) {\n}\nmerge(elems, temp, start, mid, end);\n}\n}\n\nprivate static void merge(int[] elem, int[] tmp, int leftPos, int rightPos, int rightEnd) {\nif (elem[rightPos - 1] <= elem[rightPos]) return;\n\nint leftEnd = rightPos;\nint tmpPos = leftPos;\nint numElements = rightEnd - leftPos;\n\nwhile (leftPos < leftEnd && rightPos < rightEnd)\nif (elem[leftPos] <= elem[rightPos])\ntmp[tmpPos++] = elem[leftPos++];\nelse\ntmp[tmpPos++] = elem[rightPos++];\n\nwhile (leftPos < leftEnd)\ntmp[tmpPos++] = 
elem[leftPos++];\n\nwhile (rightPos < rightEnd)\ntmp[tmpPos++] = elem[rightPos++];\n\nrightEnd--;\nfor (int i = 0; i < numElements; i++, rightEnd--)\nelem[rightEnd] = tmp[rightEnd];\n}\n\nprivate static int[] randomInts(int n) {\nint[] l = new int[n];\nRandom rnd = new Random();\n\nfor (int i = 0; i < l.length; i++) {\nl[i] = rnd.nextInt(1000);\n}\nreturn l;\n}"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#7", "metadata": {"Header 1": "The Executor Framework", "Header 2": "Fork-Join", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#7", "page_content": "private static int[] randomInts(int n) {\nint[] l = new int[n];\nRandom rnd = new Random();\n\nfor (int i = 0; i < l.length; i++) {\nl[i] = rnd.nextInt(1000);\n}\nreturn l;\n}\n\npublic static void main(String[] args) throws InterruptedException, ExecutionException {\nint SIZE = 4;\nint[] data = randomInts(SIZE);\n\nSystem.out.println(\"Unsorted: \" + Arrays.toString(data));\n\nExecutorService es = Executors.newCachedThreadPool();\n\nMergeSortTask ms = new MergeSortTask(data, es);\nFuture<?> f = es.submit(ms);\nf.get();\n\nes.shutdownNow();\nSystem.out.println(\"Sorted: \" + Arrays.toString(data));\n}\n}\n``` \nInstead it is better to do the work sequentially after a certain threshold has been reached; this is the so-called sequential threshold. \n```java\n...\npublic void run() {\nif (end - start <= 1000) { Arrays.sort(elems, start, end); return; }\nelse {\nint mid = (start + end) / 2;\n\nMergeSortTask left = new MergeSortTask(elems, temp, start, mid, ex);\nMergeSortTask right = new MergeSortTask(elems, temp, mid, end, ex);\n...\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#8", "metadata": {"Header 1": "The Executor Framework", "Header 2": "Fork-Join", "Header 3": "Fork-Join Framework", "path": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/executorFramework.mdx#8", "page_content": "For this reason, there is the fork-join framework which supports the methodology of forking work and then joining it together at the end. The framework creates a limited number of worker threads according to the number of available CPU cores. Then each worker thread maintains a private double-ended work queue. When forking a worker pushes the new task to the head of its queue. When the worker is waiting or idle it pops a task off the head of its queue and executes it instead of sleeping. If a worker's queue is empty, it steals a task off the tail of another randomly chosen worker. \n \n```java\n// RecursiveAction has no result; RecursiveTask<V> returns a result of type V\npublic class ForkJoinMergeSort extends RecursiveAction {\npublic final int[] is, tmp;\nprivate final int l, r;\n\npublic ForkJoinMergeSort(int[] is, int[] tmp, int l, int r) {\nthis.is = is; this.tmp = tmp; this.l = l; this.r = r;\n}\n\nprotected void compute() {\nif (r - l <= 100000) Arrays.sort(is, l, r);\nelse {\nint mid = (l + r) / 2;\nForkJoinMergeSort left = new ForkJoinMergeSort(is, tmp, l, mid);\nForkJoinMergeSort right = new ForkJoinMergeSort(is, tmp, mid, r);\nleft.fork();\nright.invoke();\nleft.join();\nmerge(is, tmp, l, mid, r);\n}\n}\nprivate void merge(int[] es, int[] tmp, int l, int m, int r) { ... }\n\nprivate static int[] randomInts(int n) { ... 
}\n\npublic static void main(String[] args) throws InterruptedException, ExecutionException {\nint SIZE = 4;\nint[] data = randomInts(SIZE);\nint[] tmp = new int[data.length];\n\nSystem.out.println(\"Unsorted: \" + Arrays.toString(data));\n\nForkJoinPool fjPool = new ForkJoinPool();\nForkJoinMergeSort ms = new ForkJoinMergeSort(data,tmp,0,data.length);\nfjPool.invoke(ms);\nfjPool.shutdown();\nSystem.out.println(\"Sorted: \" + Arrays.toString(data));\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#1", "metadata": {"Header 1": "Interrupts", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#1", "page_content": "Blocking methods can potentially take forever if the condition they are waiting for never occurs which can lead to big issues. For this reason, we want a mechanism to be able to stop/cancel waiting for a given condition and continue with the program. \nIn Java, the method `Thread.stop()` exists but is declared deprecated as it is unsafe. It is unsafe because the thread that it is called on releases all the monitors (locks) it was holding. Any of the objects previously protected by these released locks that were in an inconsistent state become visible to the other threads, potentially resulting in broken behavior. So we need to use a different mechanism."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#2", "metadata": {"Header 1": "Interrupts", "Header 2": "Interrupt Flag", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#2", "page_content": "Internally every thread in Java has a boolean flag corresponding to its interrupted status. When the method `interrupt()` is called on a thread the flag is set to `true`. If the thread is blocked i.e. it is in an invocation of `wait()`, `sleep()` or `join()` the flag is consumed/reset and an `InterruptedException` is thrown. If the thread is not blocked the flag is just set and can be polled and handled by the developer. The flag can be read with the `isInterrupted()` method. There is also the static function `Thread.interrupted()` which resets the flag and returns the old value. Important to know is that if the flag is set any subsequent `wait()`, `sleep()` or `join()` on that thread will immediately throw an `InterruptedException`. \n```java\npublic static void main(String args[]) {\nThread.currentThread().interrupt(); // true\nSystem.out.println(Thread.interrupted()); // prints true, now false\ntry {\nThread.sleep(1000);\nSystem.out.println(\"ok1\"); // prints\n} catch (InterruptedException e) {\nSystem.out.println(\"IE: \" + Thread.currentThread().isInterrupted());\n}\nThread.currentThread().interrupt(); // true\nSystem.out.println(Thread.currentThread().isInterrupted()); // prints true\ntry {\nThread.sleep(1000);\nSystem.out.println(\"ok2\"); // doesn't print\n} catch (InterruptedException e) {\nSystem.out.println(\"IE: \" + Thread.currentThread().isInterrupted()); // prints false\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#3", "metadata": {"Header 1": "Interrupts", "Header 2": "Handling InterruptedException", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#3", "page_content": "When an `InterruptedException` is thrown there are a few possible reactions, each with its own benefits."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#4", "metadata": {"Header 1": "Interrupts", "Header 2": "Handling InterruptedException", "Header 3": "Ignore", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#4", "page_content": "The exception can be ignored if we know for a fact that the thread's interrupt method is never called, for example when it is used in a local, non-accessible thread class. Another use case for ignoring the exception is if we want an essential service not to be interruptible. \n```java\ntry {\nwait();\n} catch (InterruptedException e) {\n//ignore\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#5", "metadata": {"Header 1": "Interrupts", "Header 2": "Handling InterruptedException", "Header 3": "Propagate", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#5", "page_content": "The exception can also be propagated up the call stack. Some simple cleanup can also be done in the exception handler before propagating. \n```java\npublic synchronized void foo() throws InterruptedException {\n...\ntry {\nwait(); // wait until not full\n} catch ( InterruptedException e ) {\n/* some cleanup */\nthrow e;\n}\n...\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#6", "metadata": {"Header 1": "Interrupts", "Header 2": "Handling InterruptedException", "Header 3": "Defer", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#6", "page_content": "In some cases, it might not be possible to propagate the exception, for example when the task is defined in a Runnable, whose `run()` method cannot throw checked exceptions. Instead, we defer the handling to a later point. For this, we restore the interrupted status so that code higher up on the call stack can handle the interruption appropriately. \n```java\ntry {\nwait();\n} catch (InterruptedException e) {\n// Restore the interrupted status\nThread.currentThread().interrupt();\n}\n```"}}
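The defer pattern can be sketched end to end; this is a minimal runnable example (the class and method names are assumptions, not from the original): the worker's `Runnable` cannot throw `InterruptedException`, so it restores the flag in the catch block and lets its own loop condition react to it.

```java
public class DeferDemo {
    // A worker whose Runnable cannot propagate InterruptedException:
    // it restores the flag so the loop condition ("code higher up") can exit.
    static boolean runAndInterrupt() {
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10); // blocking call: consumes the flag and throws
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // defer: restore the interrupted status
                }
            }
        });
        worker.start();
        worker.interrupt();
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive(); // true if the worker terminated after the interrupt
    }

    public static void main(String[] args) {
        System.out.println(runAndInterrupt());
    }
}
```

Whether the interrupt lands before the loop starts or during the `sleep()`, the restored flag makes the loop condition false and the worker terminates cleanly.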
+{"id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#7", "metadata": {"Header 1": "Interrupts", "Header 2": "Lost Wake-Up/Signal Problem", "path": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/interrupts.mdx#7", "page_content": "A lost wake-up or signal is an event that can happen when a thread is notified with a `notify()` call and is simultaneously interrupted. This results in the notify signal getting lost and possibly leading to a deadlock as the rest of the code thinks the notify was executed without any issues. A possible scenario could look like this: \n1. Threads t1 and t2 are waiting in a wait()\n2. Thread t3 performs a notify => t1 is selected\n3. Thread t4 interrupts t1\n4. The wait called by t1 throws InterruptedException, so t1 does not process the notification\n5. t2 does not wake up => Deadlock \nA solution to this problem is to call `notifyAll()` or `notify()` in the exception handler. \n```java\ncatch (InterruptedException e) {\nnotify();\nthrow e;\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx#1", "metadata": {"Header 1": "Introduction to Concurrent Programming", "Header 2": "Moore's Law", "path": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx#1", "page_content": "Moore's law isn't really a law but rather an observation by Gordon Moore in 1965 that the number of transistors on a\nmicrochip/CPU doubles about every year, which leads to exponential growth. \n \nWe can also observe this, but there is a reason why people say that Moore's law is dead. Apart from the transistor\ncount, everything else has slowly leveled out, meaning we can't get much more out of a single CPU core; instead we get\nmulticore CPUs, which have no impact on most current applications as they do not make use of concurrent programming.\nIf they did, they could gain a massive speedup. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx#2", "metadata": {"Header 1": "Introduction to Concurrent Programming", "Header 2": "Amdahl's Law", "path": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx#2", "page_content": "Amdahl's law is a formula to predict the maximum speedup of a program on a multicore CPU with $N$ processors/cores based on the\nproportion of parallelizable components of the program $p$ and the serial components $1-p$: \n$$\nspeedup \\leq \\frac{1}{(1-p)+ \\frac{p}{N}}\n$$ \n \nWhen looking at the potential speedup depending on the proportion of parallelizable components and the number of processors, we can\nsee that after a certain point, around 64 processors, the additional gain becomes very small. \n"}}
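The bound above is easy to check numerically; here is a small sketch (the class and method names are assumptions) evaluating it for a program that is 95% parallelizable:

```java
public class Amdahl {
    // Upper bound on speedup for parallel fraction p on n processors,
    // per Amdahl's law: 1 / ((1 - p) + p / n)
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        // Even with 95% parallelizable code, 64 cores give at most ~15.4x.
        System.out.printf("%.2f%n", speedup(0.95, 64));
        // And arbitrarily many cores cannot beat 1 / (1 - p) = 20x.
        System.out.printf("%.2f%n", speedup(0.95, 1_000_000));
    }
}
```

This illustrates why the curve flattens: the serial fraction $1-p$ dominates once $p/N$ becomes small.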
+{"id": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx#3", "metadata": {"Header 1": "Introduction to Concurrent Programming", "Header 2": "Concurrent Programming", "path": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/introduction.mdx#3", "page_content": "When talking about programs there are three main subsets: serial programs, concurrent programs and parallel programs. \n \nConcurrent programs have multiple logical threads whereas serial programs just have one. To solve a problem concurrently\nyou need to handle events that could happen at the same time. Because of this, concurrent programs are often\nnon-deterministic, meaning results depend on the timing of events. Parallel programs compute components simultaneously,\ni.e. in parallel. To solve a problem with parallelism you need to break the problem down into pieces that can be done in\nparallel. Concurrent programs aren't necessarily parallel but being concurrent is a precondition for a parallel program. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#1", "metadata": {"Header 1": "JMM - Java Memory Model", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#1", "page_content": "The Java Memory Model (JMM) specifies guarantees that are given by the Java Virtual Machine (JVM) relating to concurrency: \n- When writing operations on variables become visible to other threads\n- Which operations are atomic\n- Ordering of operations, meaning under which circumstances can the effects of operations appear out of order to any given thread."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#2", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Memory layout", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#2", "page_content": "Modern CPUs don't just work with the main memory (RAM); they also have multiple layers of caches and registers to perform more efficiently. You can see [on this page](https://gist.github.com/hellerbarde/2843375) why it is worth having these caches and the difference in the time it takes to read depending on how far down the CPU has to reach for the data. However, this means that there can be multiple versions of the same data on different levels, which can lead to issues. Additionally, while all threads share the main memory, each core and therefore each thread has its own cache levels, so there can be inconsistency not only inside a thread but also between threads. \n \nTo illustrate this we have the program below. When running the program we expect to see the values (1,0), (1,1) and (0,1) for all 6 possible interleavings. \n \nHowever, when running the program we also get (0,0). This is due to either compiler reordering or caching. \n"}}
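Since the referenced program is only shown as an image, here is a sketch of what such a reordering test typically looks like (class and variable names are assumptions): two threads each write one variable and then read the other; without synchronization the JMM permits all four outcomes, including the surprising (0,0).

```java
public class ReorderingTest {
    static int a, b, x, y;

    // One trial of the classic test; the outcome (x, y) is encoded as x * 2 + y
    static int trial() {
        a = 0; b = 0; x = 0; y = 0;
        Thread t1 = new Thread(() -> { a = 1; x = b; });
        Thread t2 = new Thread(() -> { b = 1; y = a; });
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            return -1;
        }
        return x * 2 + y; // 0 -> (0,0), 1 -> (0,1), 2 -> (1,0), 3 -> (1,1)
    }

    public static void main(String[] args) {
        boolean[] seen = new boolean[4];
        for (int i = 0; i < 1000; i++) {
            int r = trial();
            if (r >= 0) seen[r] = true;
        }
        // Which outcomes actually show up is timing- and JIT-dependent;
        // the JMM permits all four without synchronization.
        System.out.println(java.util.Arrays.toString(seen));
    }
}
```

(0,0) may not appear on any given run; the point is that nothing in the model forbids it.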
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#3", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#3", "page_content": "The JMM defines a relationship called happens-before on actions such as reading/writing to variables, locking/releasing monitors and starting/joining threads. These happens-before relationships guarantee that if action A happens-before action B, the results of A are visible to the thread executing B, whether B happens on the same or a different thread. If there is no such relationship then there is no guarantee!"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#4", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "Header 3": "Rule 1", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#4", "page_content": "Each action in a thread happens-before every action in that thread that comes later in the program order. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#5", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "Header 3": "Rule 2", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#5", "page_content": "Releasing a lock happens-before every subsequent lock on the same lock. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#6", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "Header 3": "Rule 3", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#6", "page_content": "A write to a volatile field happens-before every subsequent read of the same field. \n"}}
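Rule 3 is what makes the common flag-publication idiom safe; a minimal sketch (class and member names are assumptions): the plain write to `payload` is ordered before the volatile write to `ready` by program order, so a reader that observes `ready == true` is also guaranteed to see `payload`.

```java
public class Publication {
    private int payload;            // plain, non-volatile field
    private volatile boolean ready; // volatile flag

    void publish(int value) {
        payload = value; // happens-before the volatile write below (program order)
        ready = true;    // volatile write: happens-before every subsequent volatile read
    }

    Integer tryConsume() {
        // by transitivity, a reader that sees ready == true also sees payload
        return ready ? payload : null;
    }

    public static void main(String[] args) {
        Publication p = new Publication();
        p.publish(42);
        System.out.println(p.tryConsume());
    }
}
```

Without `volatile` on `ready`, the two writes in `publish()` could be reordered and a reader could see the flag set but a stale `payload`.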
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#7", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "Header 3": "Rule 4", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#7", "page_content": "A call to start a thread with `start()` happens-before every subsequent action in the started thread. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#8", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "Header 3": "Rule 5", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#8", "page_content": "Actions in a thread `t1` happen-before another thread detecting the termination of thread `t1`. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#9", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Happens before rules", "Header 3": "Rule 6", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#9", "page_content": "The happens-before order is transitive. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#10", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Volatile", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#10", "page_content": "Volatile fields guarantee the visibility of writes (i.e. volatile variables are never cached). A read of a volatile field implies getting a fresh value from memory (slower). A write to a volatile field forces the thread to flush all pending writes to main memory. Volatile variables therefore have a cost, since this extra work has to be done and caching is no longer allowed. It is also important to note that accessing a volatile variable inside a loop can be more expensive than synchronizing the entire loop. \n```java\nclass MyExchanger {\nprivate volatile Pair data = null;\npublic String getPairAsString() {\nreturn data == null ? null : data.toString();\n}\npublic boolean isReady() {\nreturn data != null;\n}\npublic void setPair(Object first, Object second) {\nPair tmp = new Pair();\ntmp.setFirst(first);\ntmp.setSecond(second);\ndata = tmp; // guaranteed to have both\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#11", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Volatile", "Header 3": "Fixing Assignment Atomicity", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#11", "page_content": "Depending on the implementation, a long or double assignment such as `double x = 3;` is not atomic; it will most likely be written 32 bits at a time. To prevent this we can make the double volatile, which guarantees the assignment is atomic."}}
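A minimal sketch of that fix (the class name is an assumption): marking the 64-bit field `volatile` guarantees a reader never observes a torn value made of two different writes.

```java
public class Sensor {
    // Without volatile, the JMM allows a long/double write to be split into
    // two 32-bit halves; volatile guarantees reads and writes are atomic.
    private volatile double reading = 0.0;

    void update(double value) { reading = value; } // atomic write
    double read() { return reading; }              // always a value that was actually written

    public static void main(String[] args) {
        Sensor s = new Sensor();
        s.update(3.0);
        System.out.println(s.read());
    }
}
```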
+{"id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#12", "metadata": {"Header 1": "JMM - Java Memory Model", "Header 2": "Double-checked Locking Problem", "path": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/javaMemoryModel.mdx#12", "page_content": "We want a Singleton with lazy initialization that is also thread-safe. Our first attempt could be something like the code below with the `getInstance()` function being synchronized so that we don't run into problems. This works fine; however, it is expensive because every `getInstance()` call pays the synchronization overhead. \n```java\npublic class Singleton {\nprivate static Singleton instance;\npublic synchronized static Singleton getInstance() {\nif(instance == null) {\ninstance = new Singleton();\n}\nreturn instance;\n}\nprivate Singleton() { /* initialization */ }\n}\n``` \nTo fix this we need to do so-called double-checking. We also need to make the instance volatile to prevent other threads from seeing a partially constructed object, since without volatile the write of the reference could be reordered before the initialization finishes. \n```java\npublic class Singleton {\nprivate volatile static Singleton instance;\npublic static Singleton getInstance() {\nif(instance == null) {\nsynchronized(Singleton.class) {\nif(instance == null) {\ninstance = new Singleton();\n}\n}\n}\nreturn instance;\n}\nprivate Singleton() { /* initialization */ }\n// other methods\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#1", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Disadvantages of Locks", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#1", "page_content": "Locks are very useful and do their job well; however, they do have some disadvantages. Because of the context switching between threads there can be overhead and performance can suffer. Probably the biggest disadvantage, though, is contention. When a thread is waiting for a lock it cannot do anything else. If a thread that holds a lock is delayed or even ends up in a deadlock, then no other thread that needs the lock can progress. This can then lead to **priority inversion**, which is when a high-priority thread waits for a lock held by a low-priority thread and therefore has its priority effectively downgraded. \nThe example below works perfectly fine but we want to remove the locks because of the previously mentioned issues. The lock for reading the value can be removed by making the value volatile so that there is a visibility guarantee. However, volatile variables do not support read-modify-write sequences, which is what we are doing when incrementing the value. So we are still stuck with a lock for incrementing and we still don't have optimal performance due to the overhead of volatile variables. \n```java\npublic final class Counter1 {\nprivate int value = 0;\npublic synchronized int getValue() { return value; }\npublic synchronized int increment() { return ++value; }\n}\npublic final class Counter2 {\nprivate volatile int value = 0;\npublic int getValue() { return value; }\npublic synchronized int increment() { return ++value; }\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#2", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "CAS - Compare and Swap", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#2", "page_content": "CPUs have an atomic instruction called compare and swap/set, `CAS(memory_location, expected_old_value, new_value)`. This operation atomically compares the content of a memory location to a given value and, if they are the same, modifies the content of that memory location to a given new value and returns a boolean corresponding to whether the swap was done, i.e. the value at the memory location was still the same as the given old value. With this operation, we can remove all of the locks in the Counter class: \n```java\npublic final class CASCounter {\nprivate volatile int value = 0;\n\npublic int getValue() {\nreturn value;\n}\npublic int increment() {\nwhile(true) {\nint current = getValue();\nint next = current + 1;\nif (compareAndSwap(current, next)) return next;\n}\n}\n\n// Wrapper for the old Sun Microsystems implementation\nprivate static final Unsafe unsafe = Unsafe.getUnsafe();\nprivate static final int valueOffset;\nstatic {\ntry {\nvalueOffset = unsafe.objectFieldOffset(CASCounter.class.getDeclaredField(\"value\"));\n} catch (Exception ex) { throw new Error(ex); }\n}\nprivate boolean compareAndSwap(int expectedVal, int newVal) {\nreturn unsafe.compareAndSwapInt(this, valueOffset, expectedVal, newVal);\n}\n}\n``` \nThis pattern is also commonly referred to as optimistic locking. It is optimistic because the code gets the old value, modifies it and optimistically hopes that in the meantime the value hasn't changed and then tries to swap the old and new value if the old value is still the same. If the value has changed in the meantime by maybe another thread then it just tries again and again until it works."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#3", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Atomics", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#3", "page_content": "Java added Atomic Scalars which support CAS and atomic arithmetic operations for int/long. For doubles or floats etc. you can use `Double.doubleToRawLongBits()` and then convert back with `Double.longBitsToDouble()`. \n \n```java\nclass AtomicInteger extends Number {\nAtomicInteger()\nAtomicInteger(int initialValue)\nboolean compareAndSet(int expect, int update)\nint incrementAndGet()\nint decrementAndGet()\nint getAndIncrement()\nint getAndDecrement()\nint addAndGet(int delta)\nint getAndAdd(int delta)\nint getAndSet(int newValue)\nint get()\nvoid set(int newValue)\nint intValue()\nlong longValue()\nfloat floatValue()\ndouble doubleValue()\n}\n``` \nThe Counter example would then look something like this: \n```java\npublic final class AtomicCounter {\nprivate final AtomicInteger value = new AtomicInteger(0);\npublic int getValue() {\nreturn value.get();\n}\npublic int increment() {\nwhile (true) {\nint oldValue = value.get();\nint newValue = oldValue + 1;\nif (value.compareAndSet(oldValue, newValue)) return newValue;\n}\n}\n}\n``` \nOr even shorter: \n```java\npublic final class AtomicCounter {\nprivate final AtomicInteger value = new AtomicInteger(0);\npublic int getValue() {\nreturn value.get();\n}\npublic int increment() {\nreturn value.incrementAndGet();\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#4", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Atomics", "Header 3": "Atomic References", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#4", "page_content": "We might not just want to work with integers, we want to be able to work with any object. For this reason, there is the AtomicReference class. For example, it can get a bit tricky when there are multiple related integer values, as when implementing a range: the precondition check and the update below are two separate atomic operations, so the invariant `lower <= upper` can still be violated by an unlucky interleaving. \n```java\npublic class NumberRange {\nprivate final AtomicInteger lower = new AtomicInteger(0);\nprivate final AtomicInteger upper = new AtomicInteger(0);\n\npublic int getLower() { return lower.get(); }\npublic void setLower(int newLower) {\nwhile (true) {\nint l = lower.get(), u = upper.get(); // get current values\nif (newLower > u) throw new IllegalArgumentException(); // check preconditions\nif (lower.compareAndSet(l, newLower)) return;\n}\n}\n// same for getUpper/setUpper\npublic boolean contains(int x) {\nreturn lower.get() <= x && x <= upper.get();\n}\n}\n``` \nSo instead we can work with AtomicReferences: \n```java\npublic class NumberRange {\nprivate static class Pair {\nfinal int lower, upper; // lower <= upper\nPair(int l, int u) { lower = l; upper = u; }\n}\n\nprivate final AtomicReference<Pair> values = new AtomicReference<>(new Pair(0,0));\n\npublic int getLower(){ return values.get().lower; }\npublic void setLower(int newLower){\nwhile(true) {\nPair oldp = values.get();\nif(newLower > oldp.upper) throw new IllegalArgumentException(); // could also check preconditions in constructor\nPair newp = new Pair(newLower, oldp.upper);\nif(values.compareAndSet(oldp, newp)) return; // uses == comparison, which is why it should be used with immutable objects\n}\n}\n}\n``` \nBe careful when using integer literals because the JVM does some special things, like caching small integer literals, which leads to the following program having unexpected behavior. \n```java\nstatic AtomicReference<Integer> as;\npublic static void main(String[] args) throws Exception {\nnew Thread(() -> {\nas = new AtomicReference<>(1);\nas.compareAndSet(1,2);\n}).start();"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#5", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Atomics", "Header 3": "Atomic References", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#5", "page_content": "new Thread(() -> System.out.println(as.get())).start();\n}\n``` \nWe would expect to get a NullPointerException or the value 1, but not the value 2: since the value 1 gets auto-boxed twice with Integer.valueOf() to two different objects, the compareAndSet should fail. But it doesn't always fail; 2 is also a possible output, because the JVM caches small Integer values and both boxings can then yield the same object.\n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#6", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Atomics", "Header 3": "ABA Problem", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#6", "page_content": "The ABA problem occurs in lock-free programming when a variable that was read has been changed by another thread in the following order: \n`A -> B -> A` \nThe CAS operation will compare its A with A and think that \"nothing has changed\" even though the second thread did work which violates that assumption. For example: \n1. Thread T1 reads value A from shared memory.\n2. T1 is put to sleep, allowing thread T2 to run.\n3. T2 modifies the shared memory value A to value B and back to A before going to sleep.\n4. T1 begins execution again, sees that the shared memory value has not changed and continues. \nFor this reason, Java provides the AtomicStampedReference class which holds an object reference and a stamp internally. The reference and stamp can be swapped using a single atomic compare-and-swap operation, via the compareAndSet() method. \n```java\npublic class AtomicStampedReference<V> {\npublic AtomicStampedReference(V ref, int stamp) { ... }\npublic V getReference() { ... } // returns reference\npublic int getStamp() { ... } // returns stamp\npublic V get(int[] stampHolder) { ... } // returns both\npublic void set(V newReference, int newStamp) { ... }\npublic boolean compareAndSet(V expectedReference, V newReference, int expectedStamp, int newStamp) { ... }\npublic boolean attemptStamp(V expectedReference, int newStamp) { ... }\n}\n``` \n```java\nprivate final AtomicStampedReference<Integer> account = new AtomicStampedReference<>(100, 0); // initial value=100 stamp=0\n\npublic int deposit(int funds) {\nint[] stamp = new int[1];\nwhile(true){\nint oldValue = account.get(stamp);\nint newValue = oldValue + funds;\nint newStamp = stamp[0] + 1;\nif(account.compareAndSet(oldValue, newValue, stamp[0], newStamp)) return newValue;\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#7", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Non-blocking Data structures", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#7", "page_content": "With the Atomic Scalars in Java, you can then also implement some simple data structures."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#8", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Non-blocking Data structures", "Header 3": "Stack", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#8", "page_content": "```java\npublic class ConcurrentStack<E> {\nprivate static class Node<E> {\npublic final E item;\npublic Node<E> next;\npublic Node(E item) { this.item = item; }\n}\n\nfinal AtomicReference<Node<E>> head = new AtomicReference<>();\n\npublic void push(E item) {\nNode<E> newHead = new Node<>(item);\nwhile(true) {\nNode<E> oldHead = head.get();\nnewHead.next = oldHead;\nif (head.compareAndSet(oldHead, newHead)) return;\n}\n}\npublic E pop() {\nwhile(true) {\nNode<E> oldHead = head.get();\nif (oldHead == null) throw new EmptyStackException();\nNode<E> newHead = oldHead.next;\nif(head.compareAndSet(oldHead, newHead)) {\nreturn oldHead.item;\n}\n}\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#9", "metadata": {"Header 1": "Lock-Free Programming", "Header 2": "Non-blocking Data structures", "Header 3": "Queue", "path": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/lockFreeProgramming.mdx#9", "page_content": "The tricky part of implementing a non-blocking queue is that two things need to be watched, the head and the tail. In the implementation below a dummy node is used. This then leads to there being 3 states that the tail can be in: \n- The tail refers to the dummy, i.e. to the same node as the head; then the queue is empty.\n- The tail refers to the last element.\n- The tail refers to the second last element, which can only happen in the middle of an update. \n```java\npublic class ConcurrentQueue<E> {\nprivate static class Node<E> {\nfinal E item;\nfinal AtomicReference<Node<E>> next;\npublic Node(E item, Node<E> next) {\nthis.item = item;\nthis.next = new AtomicReference<>(next);\n}\n}\n\nprivate final Node<E> dummy = new Node<>(null, null);\nprivate final AtomicReference<Node<E>> head = new AtomicReference<>(dummy);\nprivate final AtomicReference<Node<E>> tail = new AtomicReference<>(dummy);\n\npublic boolean put(E item) {\nNode<E> newNode = new Node<>(item, null);\nwhile (true) {\nNode<E> curTail = tail.get();\nNode<E> tailNext = curTail.next.get();\nif (tailNext != null) {\n// Queue in intermediate state, advance tail\ntail.compareAndSet(curTail, tailNext);\n} else {\n// In consistent state, try inserting new node\nif (curTail.next.compareAndSet(null, newNode)) {\n// Insertion succeeded, try advancing tail\ntail.compareAndSet(curTail, newNode);\nreturn true;\n}\n}\n}\n}\n\npublic E pop() {\nwhile(true) {\nNode<E> oldHead = head.get();\nNode<E> newHead = oldHead.next.get();\nif (newHead == null) throw new NoSuchElementException(); // only the dummy is left, the queue is empty\nif(head.compareAndSet(oldHead, newHead)) {\nreturn newHead.item; // the new head becomes the dummy, its item is the dequeued value\n}\n}\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#1", "metadata": {"Header 1": "Locking", "Header 2": "Interleavings", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#1", "page_content": "Interleaving is a possible way in which a series of statements could be executed. This concept is important because in concurrent programming the interleaving of a program could influence the result. Choosing the interleaving is however not up to us but the scheduler. \n \nThe picture above shows some possible interleavings of a program split up between two threads. \n```java\nclass Counter {\nprivate int i = 0;\npublic void inc() { i++; }\npublic int getCount() { return i; }\n}\nclass R implements Runnable {\nprivate Counter c;\npublic R(Counter c) { this.c = c; }\npublic void run() {\nfor (int i = 0; i < 100000; i++) {\nc.inc();\n}\n}\n}\npublic class CounterTest {\npublic static void main(String[] args) {\nCounter c = new Counter();\nRunnable r = new R(c);\nThread t1 = new Thread(r); Thread t2 = new Thread(r);\nThread t3 = new Thread(r); Thread t4 = new Thread(r);\nt1.start(); t2.start(); t3.start(); t4.start();\ntry {\nt1.join(); t2.join(); t3.join(); t4.join();\n} catch (InterruptedException e) {}\nSystem.out.println(c.getCount());\n}\n}\n``` \nIf we execute the above code we could expect the result to be 400000, because there are 4 threads, each thread increases the counter 100000 times and we only output the result once all threads have terminated. However, when executing this program this is not the case; we might see something like 108600 and, on another run, 118127. These results happen because the scheduler is allowed to switch context between every CPU operation. So we can see that read and write operations are not guaranteed to be atomic, meaning done as one instruction by the CPU. Even writing a value of type double might be done in 2 parts: it might assign the first 32 bits and then the next 32 bits. In the example, the scenario below happened a few times, which causes modifications to get lost. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#2", "metadata": {"Header 1": "Locking", "Header 2": "Interleavings", "Header 3": "Interleaving Model", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#2", "page_content": "The interleaving model is used to calculate the number of possible interleavings (size of the set of possible interleavings) depending on the number of threads $n$ and the number of atomic instructions $m$. \n$$\ninterleavings = \\frac{(n \\cdot m)!}{(m!)^n}\n$$ \nFor example, if there are 2 threads and a program with 3 atomic instructions then there are 20 possible ways the program could be executed across the 2 threads. Just by increasing the number of threads to 4 the number of possible interleavings skyrockets to 369'600."}}
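The formula can be verified directly; this is a small sketch (the class and method names are assumptions) using `BigInteger` since the factorials grow quickly:

```java
import java.math.BigInteger;

public class Interleavings {
    static BigInteger factorial(int k) {
        BigInteger f = BigInteger.ONE;
        for (int i = 2; i <= k; i++) f = f.multiply(BigInteger.valueOf(i));
        return f;
    }

    // Number of interleavings of n threads with m atomic instructions each:
    // (n * m)! / (m!)^n
    static BigInteger count(int n, int m) {
        return factorial(n * m).divide(factorial(m).pow(n));
    }

    public static void main(String[] args) {
        System.out.println(count(2, 3)); // 20
        System.out.println(count(4, 3)); // 369600
    }
}
```

This reproduces the two figures from the text: 2 threads with 3 instructions give 20 interleavings, and 4 threads give 369'600.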
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#3", "metadata": {"Header 1": "Locking", "Header 2": "Race Conditions", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#3", "page_content": "A race condition can happen when a result depends on the interleaving of a program across two or more threads. Critically, race conditions can happen whenever two or more threads are accessing shared data and at least one of them is modifying the data. This leads to unpredictable results as thread scheduling is nondeterministic."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#4", "metadata": {"Header 1": "Locking", "Header 2": "Synchronization", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#4", "page_content": "Synchronization is a technique of managing access to shared mutable data to prevent race conditions."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#5", "metadata": {"Header 1": "Locking", "Header 2": "Locks", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#5", "page_content": "A lock or mutex (from mutual exclusion) is a mechanism to enforce mutual exclusion i.e limits access to a resource when multiple threads want to access the resource. Mutual exclusion prevents simultaneous access by only allowing one thread at a time to access a shared resource and therefore guarding critical sections against concurrent execution. By locking a certain section you are also forcing atomicity as no other thread can enter that section of code whilst another thread holds it. This can be a double-edged as it makes the program thread-safe but also means that we are not making use of concurrency. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#6", "metadata": {"Header 1": "Locking", "Header 2": "Locks", "Header 3": "Built-in Locking in Java", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#6", "page_content": "Java has a built-in locking mechanism, the `synchronized` keyword. Locking consists of two parts: The object that will serve as a lock and a block of code, the critical section, that is guarded by the lock. When a thread reaches the synchronized block and the lock is not in use the thread can acquire the lock to the block. However, if the lock is not available because it has already been taken then the thread enters the waiting list. When a thread exits a synchronized section the lock is released and there is a race to which thread gets to acquire the lock next. Often the lock is just on the current instance (`this`) or class in a static context. This is what Java does by default if you do not specify a certain lock object. Something to be careful of is using String literals as a lock as it can [cause some big issues](https://stackoverflow.com/a/463437) because according to [Section 3.10.5 of the Java Language Specification](https://docs.oracle.com/javase/specs/jls/se18/html/jls-3.html#jls-3.10.5): Literal strings within different classes in different packages likewise represent references to the same String object. \n\nSynchronizing is not free it comes with additional code (monitorenter and monitorexit are added in the byte code) and also means that the compiler can make fewer optimizations.\n \nThe above example could be fixed by doing one of the following: \n```java\nclass Counter {\nprivate int i = 0;\nprivate final Object lock = new Object();\npublic synchronized void inc() { i++; }\n// OR public void inc() { synchronized(this){ i++; } }\n// OR public void inc() { synchronized(lock){ i++; } }\npublic int getCount() { return i; }\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#7", "metadata": {"Header 1": "Locking", "Header 2": "Locks", "Header 3": "Deadlock", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#7", "page_content": "A Deadlock is a situation where at least one thread is blocked because it is holding a resource and is waiting for another resource which is already being held by another thread that wants the other resource being held. So in other words the necessary conditions for a deadlock to happen are: \n- Mutual Exclusion\n- Hold and Wait, threads are requesting additional resources whilst also holding other resources.\n- No Preemption, resources are released exclusively by threads.\n- Circular Wait, two or more threads form a circular chain where each thread waits for a\nresource that the next thread in the chain holds. \n \n#### Global Ordering \nOne way of avoiding deadlocks is to order the way the locks are obtained so instead of having the following situation: \n \nWe can acquire the locks in lexicographical order. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#8", "metadata": {"Header 1": "Locking", "Header 2": "Locks", "Header 3": "Reentrancy", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#8", "page_content": "Synchronized is also reentrant. Meaning that the same lock can be acquired multiple times by the same thread. Java does this by keeping a counter for each lock with the initial value being 0. When a thread then acquires initially acquires the lock it sets the lock-id to the current thread and increments the counter. For each further acquisition of that lock, the counter is just further incremented. Each lock release then decrements the counter and once the counter reaches 0 again the lock is completely released and made available again to the other threads. The following examples do not cause a deadlock. \n```java\nsynchronized f() { g(); }\nsynchronized g() {\n/* no deadlock */\nsynchronized(x) {\nsynchronized(x) { /* still no deadlock */ }\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#9", "metadata": {"Header 1": "Locking", "Header 2": "Locks", "Header 3": "java.util Locks", "path": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/locking.mdx#9", "page_content": "Additionally to the `synchronized` keyword Java also offers some lock implementations that are more flexible. It is important to use these locks with a `try` block so that the lock can be released in the `finally` block in case any exceptions occur. \n```java\ninterface Lock{\nvoid lock() // Acquires the lock.\nvoid lockInterruptibly() // Acquires the lock unless the current thread is interrupted.\nCondition newCondition() // Returns a new Condition instance that is bound to this Lock instance.\nboolean tryLock() // Acquires the lock only if it is free at the time of invocation.\nboolean tryLock(long time, TimeUnit unit) // Acquires the lock if it is free within the given waiting time and the current thread has not been interrupted.\nvoid unlock() // Releases the lock.\n}\n``` \nUsage Pattern: \n```java\npublic synchronized void inc() {\nLock lock = ...;\n...\nlock.lock();\ntry {\n// access resources protected by this lock\n}\nfinally {\nlock.unlock(); // by the same thread!\n}\n}\n``` \n#### Reentrant Lock \nThe class `ReeentrantLock` implements the `Lock` interface. It offers the same functionality as when using the synchronized mechanism with some extra functions: \n- `int getHoldCount()` queries the number of holds on this lock by the current thread.\n- `Thread getOwner()` returns the thread that currently owns the lock, or null if not owned.\n- `Collection getQueuedThreads()` returns a collection containing threads that are waiting to acquire this lock.\n- `int getQueueLength()` returns an estimate of the number of threads waiting to acquire this lock. 
\nA fairness parameter can also be passed with the constructor to define whether the lock is fair or not. Fair locks let threads acquire the lock in the order it was requested i.e. the longest waiting thread always gets the lock (FIFO). An unfair lock is how synchronized works it lets the threads race to acquire the lock."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#1", "metadata": {"Header 1": "Safe Object Sharing", "path": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#1", "page_content": "There are 2 alternatives to synchronizing objects to make sure that nothing breaks when sharing objects. Either the shared object is immutable which would lead to there never being any inconsistent states between the threads. The other alternative is you just don't have a shared state variable between threads."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#2", "metadata": {"Header 1": "Safe Object Sharing", "Header 2": "Immutable Objects", "path": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#2", "page_content": "In principle, immutable objects aren't very complicated. You do however have to be aware of how the object is initialized when working concurrently as we don't want half- or even non-initialized objects. For example, in the example below the `account` object could be un- or partial-initialized. \n```java\n// immutable\nfinal class Account {\nprivate int balance;\npublic Account(int balance) {\nthis.balance = balance;\n}\npublic String toString() { return \"\" + balance; }\n}\nclass Company {\nprivate Account account = null;\npublic Account getAccount() { // lazy initialization\nif(account == null) account = new Account(10000);\nreturn account;\n}\n}\n``` \nTo illustrate how `account` could break we can imagine that we have two threads, `T0` and `T1`. If we then call `T0: company.getAccount().toString();` and `T1: company.getAccount().toString();` we don't have a guaranty that we get 10000, we could also get 0. The reason for this is that there could be an interleaving between the object creation and the assignment of the `balance` field, resulting in a partial-initialized object. To fix this we could make the `account` field volatile. The happens-before relation then guarantees that fields set in the constructor are visible as the invocation of the constructor happens-before the assignment to the volatile field `account`. 
\n```java\nclass Company {\nprivate volatile Account account = null; // safe publication\npublic Account getAccount() {\nif(account == null) account = new Account(10000);\nreturn account;\n}\n}\n``` \nUsing volatile for this is very expensive as we have previously seen and means that the CPU can't make performance optimizations by caching values and we only really want the functionality of volatile for the first initialization and not for any further calls of `getAccount()`. For this reason, the JMM guarantees that final fields are only visible after they have been initialized! This means that if a thread sees a reference to an Account instance, it has the"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#3", "metadata": {"Header 1": "Safe Object Sharing", "Header 2": "Immutable Objects", "path": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#3", "page_content": "guarantee to see all the final fields fully initialized. The JMM also guarantees that if a reference of an object is final, all referenced objects are visible after initialization if accessed over the final reference. \n```java\nclass Account {\nprivate final int balance;\npublic Account(int balance) { this.balance = balance; }\npublic String toString() { return \"\" + balance; }\n}\n``` \nInitialization-Safety is however only guaranteed if an object is accessed after it is fully constructed. For this to be the case you can not allow the `this` reference to escape during construction. Some possible ways the `this` reference could escape: \n- Publishing an instance of an inner class. This implicitly publishes the enclosing instance as well because the inner class instance contains a hidden reference to the enclosing instance. For example when registering an event listener from the constructor. \n- Starting a thread within a constructor. When an object creates a thread from its constructor, it almost always shares its `this` reference with the new thread. Either explicitly, by passing it to the constructor or implicitly, because the Thread or Runnable is an inner class of the owning object. The new thread might then be able to see the owning object before it is fully constructed. \n- Calling an alien method in the constructor. An alien methods behavior is not fully specified by the invoking class because it is either in another class or an overridable method."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#4", "metadata": {"Header 1": "Safe Object Sharing", "Header 2": "Thread Locals", "path": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#4", "page_content": "The [DateFormat Class](https://docs.oracle.com/javase/7/docs/api/java/text/DateFormat.html) in Java is documented not to be thread-safe. Instead, it is recommended we use a fresh instance on every invocation or a separate instance for each thread. \n```java\npublic class BadFormatter {\nprivate static final SimpleDateFormat sdf = new SimpleDateFormat();\npublic static String format(Date d) {\nreturn sdf.format(d);\n}\n}\npublic class GoodFormatter {\npublic static String format(Date d) {\nSimpleDateFormat sdf = new SimpleDateFormat();\nreturn sdf.format(d);\n}\n}\n``` \nIn the solution above we are creating a fresh instance for each call which can be quite expensive. Instead, we can use the [ThreadLocal class](https://docs.oracle.com/javase/7/docs/api/java/lang/ThreadLocal.html). A thread-local variable provides a separate copy of its value for each thread that uses it. It, therefore, provides a mechanism to pass state down the call stack without having to explicitly define an additional method parameter. \n```java\nclass ThreadLocal {\npublic T get();\npublic void set(T value);\nprotected T initialValue();\npublic void remove();\n}\n``` \nWe could then solve the above problem like the following \n```java\nclass ThreadLocalFormatter {\nprivate static ThreadLocal local = ThreadLocal.withInitial(() -> new SimpleDateFormat());\npublic static String format(Date d) {\nreturn local.get().format(d);\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#5", "metadata": {"Header 1": "Safe Object Sharing", "Header 2": "Thread Locals", "Header 3": "ThreadLocalRandom", "path": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/safeObjectSharing.mdx#5", "page_content": "Although the [Random class](https://docs.oracle.com/javase/8/docs/api/java/util/Random.html) is thread-safe the concurrent use of the same Random instance across threads may encounter [thread contention](https://stackoverflow.com/questions/1970345/what-is-thread-contention) and consequently have poor performance. For this reason the [ThreadLocalRandom class](https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadLocalRandom.html) was added. \n```java\n// Usage\nThreadLocalRandom.current().nextInt() // current returns the current thread's ThreadLocalRandom instance\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#1", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Advantages of Scala for Concurrency", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#1", "page_content": "Scala has many advantages for why it is popular for concurrent programs, however, the main reason is that it is heavily influenced by functional languages and has a large focus on immutability which eases concurrent programming heavily. Because Scala is built up on the JVM it already supports thread-based concurrency and the executor framework. But it also builds up on that with the Software Transactional Memory (STM) system which is inspired by Clojure and Haskell. It also offers the Akka library which supports type-safe Actor-based concurrency just like in Erlang."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#2", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Option", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#2", "page_content": "An object of `Option` represents an optional value, either it is present or it isn't. Instances of Option are either an instance of `Some` or the object `None`. \n```scala\nval o1: Option[Int] = None // same as Option(null)\nval o2 = Some(10)\nprintln(o1.isDefined)\nprintln(o1.isEmpty)\nprintln(o1.getOrElse(\"Empty\"))\nprintln(o2.get)\n``` \nAn Option is optimal for pattern matching: \n```scala\nval result = List(1,2,3).find(i => i >= 2) match {\ncase None => \"Not Found!\"\ncase Some(elem) => \"Found: \" + elem\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#3", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Collections", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#3", "page_content": "Scala has mutable and immutable collections each in their corresponding sub-package `scala.collection.immutable` and `scala.collection.mutable`. A mutable collection can be updated or extended in place. This means when you change, add, or remove elements of a mutable collection it is done in place as a side effect. However, immutable collections never change. Operations that change, add or remove elements return a new immutable collection and leave the old collection unchanged. In scala the default is to use the immutable, if you want the mutable version you need to explicitly import it."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#4", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Collections", "Header 3": "Arrays", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#4", "page_content": "In scala, an array is a mutable sequence of objects that share the same type with a fixed length that is given when the array is instantiated. \n```scala\nval reserved = new Array[String](3) // calls constructor in Array class\nval words = Array(\"zero\", \"one\", \"two\") // calls apply() factory in companion object\n\nfor (i <- 0 to 2) // to is a method using operator notation returning a sequence, 0.to(2)\nprintln(words(i)) // calls apply function / () operator in Array class\n\nwords(0) = \"nothing\" // shorthand for words.update(0, \"nothing\")\n\nwords.foreach(println) // shorthand for one argument with println(_)\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#5", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Collections", "Header 3": "Lists and Sets", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#5", "page_content": "In scala, `List` is a concrete class, not an interface as in Java. Meaning we can create immutable or mutable List objects. The advantage of a List is that it can contain an arbitrary amount of elements because it is implemented as a linked list. But just like with arrays a List can only contain objects of the same type. \n```scala\nval list0 = List(1, 2, 3) // List(1,2,3)\nval head = list0.head // 1, first element\nval tail = list0.tail // List(2,3), the rest\nval init = list0.init // List(1,2), all but last\nval reversed = list0.reverse // List(3,2,1)\nval list1 = 0 :: list0 // List(0,1,2,3), prepend\nval list2 = Nil // List() or List.empty\nval list3 = list0.map(i => i + 1) // List(2,3,4)\nval list4 = list0.filter(i => i > 1) // List(2,3)\nval sum0 = list0.reduce((x, y) => x + y) // 6\nval sum1 = list0.sum // 6\nval count = list0.count(i => i > 1) // 2\nval list5 = list0.zip(List('A', 'B', 'C')) // List((1,A), (2,B), (1,C))\nval list6 = list0.groupBy(i => i % 2 == 0) // Map(false -> List(1,1), true -> List(2))\nval large = list0.find(i => i > 12) // None\nval small = list0.find(i => i < 12) // Some(1)\nval list7 = list0.drop(0) // List(2,3)\nval list8 = list0.dropRight(2) // List(1), without 2 right most elements\nval list9 = list0 ::: List(3,5,6) // List(1,2,3,4,5,6)\nlist0.foreach(i => print(i + \" \")) // 1 2 3\nprintln(list0.mkString(\", \")) // 1, 2, 3\n``` \nA Set is the same as a List but can only contain unique values. 
\n```scala\nval set0 = Set(1,2,3,2) // Set(1,2,3)\nval set1 = set0 + 4 // Set(1,2,3,4)\nval set2 = set0 - 1 // Set(2,3)\nval contains = set1(0) // false, same as set1.contains(0)\nval set3 = set1.filter(i => i > 2) // Set(3,4)\nval set4 = set1.map(i => i > 2) // Set(false,true)\nval subset = set2 subsetOf set0 // true\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#6", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Collections", "Header 3": "Tuples", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#6", "page_content": "Tuples are like lists but can contain different types of elements. They are commonly used for returning multiple values from a function. \n```scala\nval pair = (99, \"Luftballons\") // is inferred to the type Tuple2[Int, String]\nval num = pair(0)\nval what = pair(1)\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#7", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Collections", "Header 3": "Maps", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#7", "page_content": "```scala\nval map0 = Map(1 -> \"one\", 2 -> \"two\") // Map(1 -> \"one\", 2 -> \"two\")\nval map1 = map0 + (3 -> \"three\") // Map(1 -> \"one\", 2 -> \"two\", 3 -> \"three\")\nval map2 = map1 - 1 // Map(2 -> \"two\", 3 -> \"three\")\nval val1 = map0(1) // \"one\"\nval val0 = map0(0) // j.u.NoSuchElementException: key not found: 0\nval optVal0 = map0.get(0) // None\nval optVal1 = map0.get(1) // Some(1)\nval res = map1.filter(kv => kv._1 > 2) // Map(3 -> \"three\")\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#8", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Collections", "Header 3": "Parallel Collections", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#8", "page_content": "Parallel collections were included in the Scala standard library to enable parallel programming without users needing to know low-level details by providing a simple high-level abstraction. Some prime operations for parallelization are: \n```scala\nval list = (1 to 10000).toList\nval res = list.map(i => i * 3)\nval even = list.filter(i => i % 2 == 0)\nval sum = list.reduce((i,j) => i + j)\n\nval par_res = list.par.map(i => i*3).seq.toList\nval par_even = list.par.filter(i => i % 2 == 0).seq.toList\nval par_sum = list.par.reduce((i,j) => i + j)\n``` \nThere are some things you do need to be aware of when using parallel collections. For example, if the collection isn't very big then the setup for parallelizing the functions might be larger than the performance gain. The other thing to be aware of is non-deterministic functions such as non-associative operations. \n```scala\n(1 to 5).foreach(print) // 12345\n(1 to 5).par.foreach(print) // depending on execution 34512\n(1 to 1000).par.reduce(_ - _) // depending on execution -330101\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#9", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Futures", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#9", "page_content": "The problem with Javas Futures is that the get call is blocking. \n```scala\npublic Future[String] loadHomePage() { ... }\npublic Map indexContent(String content) { ... }\npublic void work() throws Exception {\n// Block current Thread until result is available\nString content = loadHomePage().get();\nMap index = indexContent(content);\nSystem.out.println(index);\n}\n``` \nInstead, we would much rather make use of the observable pattern instead of having to wait for results. In Scala, by default, futures are non-blocking as they make use of callback functions instead of typical blocking operations. However, blocking is still possible but is heavily discouraged. Futures in Scala are defined as followed: \n```scala\nobject Future {\ndef apply[T](task: => T)(implicit ec: ExecutionContext): Future[T]\n}\n// Usage:\nval inverseFuture : Future[Matrix] = Future {\nfatMatrix.inverse() // non-blocking long lasting computation\n}(executionContext)\n\n// or in short\nimport scala.concurrent.ExecutionContext.Implicits.global\nval inverseFuture : Future[Matrix] = Future {\nfatMatrix.inverse()\n}\n``` \nLet’s assume we want to fetch a list of recent posts and display them. We can register a callback by using the `onComplete[U](f: Try[A] => U]): Unit` method, where Try is very similar to `Option` and can have the value of type `Success[T]` if the future completes successfully, or a value of type `Failure[T]` otherwise. 
\n```scala\nval f: Future[List[String]] = Future {\nsession.getRecentPosts()\n}\n\nf.onComplete {\ncase Success(posts) => for (post <- posts) println(post)\ncase Failure(t) => println(\"An error has occurred: \" + t.getMessage)\n}\n``` \nRegistering a `foreach` callback has the same semantics as onComplete, with the difference that the closure is only called if the future is completed successfully. \n```scala\nval f: Future[List[String]] = Future {\nsession.getRecentPosts\n}"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#10", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Scala Futures", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#10", "page_content": "f.foreach { posts =>\nfor (post <- posts) println(post)\n}\n``` \nGiven a future and a mapping function for the value of the future you can produce a new future that is completed with the mapped value once the original future is successfully completed with the `map` combinator. \n```scala\nval rateQuote = Future {\nconnection.getCurrentValue(USD)\n}\n\nval purchase = rateQuote map { quote =>\nif (isProfitable(quote)) connection.buy(amount, quote)\nelse throw new Exception(\"not profitable\")\n}\n\npurchase foreach { amount =>\nprintln(\"Purchased \" + amount + \" USD\")\n}\n``` \nBut what happens if isProfitable returns false, hence causing an exception to be thrown? In that case purchase fails with that exception. Furthermore, imagine that the connection was broken and that getCurrentValue threw an exception, failing rateQuote. In that case we’d have no value to map, so the purchase would automatically be failed with the same exception as rateQuote. \nIn conclusion, if the original future is completed successfully then the returned future is completed with a mapped value from the original future. If the mapping function throws an exception the future is completed with that exception. If the original future fails with an exception then the returned future also contains the same exception. 
\nThe `flatmap` does basically the same thing but also wraps it into a future: \n```scala\nval usdQuote = Future { connection.getCurrentValue(USD) }\nval chfQuote = Future { connection.getCurrentValue(CHF) }\n\nval purchase = usdQuote flatMap {\nusd =>\nchfQuote\n.withFilter(chf => isProfitable(usd, chf))\n.map(chf => connection.buy(amount, chf))\n}\n``` \nYou can as mentioned however also use blocking calls on futures: \n```scala\nval result = Await.result(homepage, 1 second)\nval result = Await.result(homepage, Duration.Inf)\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#11", "metadata": {"Header 1": "Concurrent Programming in Scala", "Header 2": "Reactive Programming with RxScala", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaConcurrency.mdx#11", "page_content": "ReactiveX (Rx) is a library for composing asynchronous and event-based programs by using observable sequences. It extends the observer pattern to support sequences of data and/or events. It offers the following structures: \nAn Observable represents an observable sequence of events much like an Iterable: \n```scala\ntrait Observable[T] {\ndef subscribe(obs: Observer[T]): Subscription\n}\ntrait Observer[T] {\ndef onNext(t: T): Unit\ndef onCompleted(): Unit\ndef onError(t: Throwable): Unit\n}\ntrait Subscription {\ndef unsubscribe(): Unit\ndef isUnsubscribed(): Boolean\n}\n\n// Observable[String] emitting some HTML strings\ngetDataFromNetwork()\n.drop(7)\n.filter(s => s.startsWith(\"h\"))\n.take(12)\n.map(s => toJson(s))\n.subscribe(j => println(j)) // instead of foreach\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#1", "metadata": {"Header 1": "STM - Software Transactional Memory", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#1", "page_content": "Up till now, we have always been writing about how we want something to work atomically and with a lot of bloat code and knowledge of what goes on in the background. For example, just a simple thread-safe transfer method can quickly become very complicated: \n```java\npublic void transfer(Account from, Account to, double amount) throws InactiveException, OverdrawException {\nAccount x, y;\n// lexicographically order locks\nif (from.getNumber().compareTo(to.getNumber()) < 0) {\nx = from; y = to;\n} else {\nx = to; y = from;\n}\nsynchronized (x) {\nsynchronized (y) {\nfrom.withdraw(amount);\ntry {\nto.deposit(amount);\n} catch (InactiveException e) {\nfrom.deposit(amount); // if failed load money back\nthrow e;\n}\n}\n}\n}\n``` \nInstead, we would much rather just be able to say something like the following: \n```scala\ndef transfer(from: Account, to: Account, amount: Double): Unit = {\natomic { implicit tx =>\nfrom.withdraw(amount)\nto.deposit(amount)\n}\n}\n``` \nAnd we can do something very similar to this with the software transactional memory (STM) system in scala. The STM is a coordination mechanism for shared memory and can therefore coordinate access to heap locations in a concurrent environment. \nThe STM is heavily inspired by transactions for databases where you have the ACID principle (atomic, consistent, isolated, durable). In other words, a transaction is a sequence of reading and writing operations to shared memory that occur logically (consistent) at a single instant in time (atomic) and where the intermediate state is not visible to other transactions (isolated). \nJust like for databases at the end of a transaction the state is checked for any conflicts. 
If there is a conflict the transaction is aborted and retried, if there isn't then the changes are made permanent and visible to the other transactions. \nSo it is very similar to working with CAS: \n```java\n@volatile\nprivate bal: Int = 0;"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#2", "metadata": {"Header 1": "STM - Software Transactional Memory", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#2", "page_content": "def deposit(amount: Int): Unit = {\nwhile (true) {\nval oldBal = bal; // read current value\nval newBal = oldBal + amount; // compute new value\nif (compareAndSet(addr(bal), oldBal, newBal)) {\nreturn; // commit successful -> return\n}\n// conflict -> retry\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#3", "metadata": {"Header 1": "STM - Software Transactional Memory", "Header 2": "ScalaSTM", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#3", "page_content": "To use ScalaSTM you need to add the dependency to your project and then import it with `import scala.concurrent.stm._`"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#4", "metadata": {"Header 1": "STM - Software Transactional Memory", "Header 2": "ScalaSTM", "Header 3": "Ref and Atomic", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#4", "page_content": "ScalaSTM offers the `Ref` class which can be used as a wrapper (mutable cell) for a reference. The access to this reference is coordinated by the STM system. The reference held by the wrapper should be immutable otherwise the reference can be changed via the reference and the STM system can not coordinate these changes. \n```scala\nval ref: Ref[Int] = Ref(1)\nval refView: Ref.View[Int] = ref.single\n``` \nSingle-operation memory transactions may be performed without an explicit atomic block using the `Ref.View` returned from `ref.single`. Otherwise, Ref is only allowed to be changed inside the static scope of an atomic block. Reads and writes of a Ref are performed by using `x.get` and `x.set(newValue)`, or more concisely by `x()` and `x() = newValue`. \n```scala\nobject CheatSheet extends App {\n\nval x = Ref(10) // allocate a Ref[Int]\nval y = Ref.make[String]() // type-specific default, holds no reference\nval z = x.single // Ref.View[Int]\n\n// can perform single operations on Ref.View objects\nz.set(11) // will act as if in atomic block\nprintln(z())\nval success = z.compareAndSet(11, 12)\nval old = z.swap(13) // old: Int\nprintln(old)\n\n// println(x()) can only be done in atomic block\n\natomic { implicit txn =>\nval i = x() // read\ny() = \"x was \" + i // write\nz() = 10\nval eq = atomic { implicit txn => // nested atomic\nx() == z() // both Ref and Ref.View can be used inside atomic\n}\nassert(eq)\ny.set(y.get + \", long-form access\")\n}"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#5", "metadata": {"Header 1": "STM - Software Transactional Memory", "Header 2": "ScalaSTM", "Header 3": "Ref and Atomic", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#5", "page_content": "atomic { implicit txn =>\nval i = x() // read\ny() = \"x was \" + i // write\nz() = 10\nval eq = atomic { implicit txn => // nested atomic\nx() == z() // both Ref and Ref.View can be used inside atomic\n}\nassert(eq)\ny.set(y.get + \", long-form access\")\n}\n\n// atomic transformation\nz.transform {\n_ max 20\n}\nval pre = y.single.getAndTransform {\n_.toUpperCase\n}\nval post = y.single.transformAndGet {\n_.filterNot {\n_ == ' '\n}\n}\n}\n``` \nThe atomic function is defined as `def atomic[Z](block: InTxn => Z): Z` and takes a block with a parameter of type InTxn which provides a context for the transaction to be executed. The context has to be marked implicit as it is automatically pulled in. Luckily, the atomic function is composable: when the code arrives at an atomic block it checks whether it can join an existing transaction or creates a new one, so we can do something like this: \n```scala\nclass STMAccount(val id: Int) {\nprivate val balance = Ref(0d)\ndef withdraw(a: Double) {\natomic { implicit txn =>\nbalance() = balance() - a\n}\n}\ndef deposit(a: Double) {\natomic { implicit txn =>\nbalance() = balance() + a\n}\n}\n}\nclass STMBank {\ndef transfer(amount: Double, from: STMAccount, to: STMAccount) {\natomic { implicit txn =>\nto.deposit(amount)\nfrom.withdraw(amount)\n}\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#6", "metadata": {"Header 1": "STM - Software Transactional Memory", "Header 2": "ScalaSTM", "Header 3": "Exceptions in Atomic", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#6", "page_content": "If an exception occurs inside an atomic block it can be caught and handled inside the atomic block. But if it is not caught then the transaction is rolled back and the exception is thrown higher up. \n```scala\nval last = Ref(\"none\")\natomic { implicit txn =>\nlast() = \"outer\"\ntry {\natomic { implicit txn =>\nlast() = \"inner\"\nthrow new RuntimeException\n}\n} catch {\ncase _: RuntimeException =>\n}\n}\n\nprintln(last.single.get) // outer because inner was rolled back\n``` \nYou do have to be aware of some things though, for example the following will only output the value 0. This is because transactions are compositional, meaning the inner transactions only commit once the outer transaction has committed: \n```scala\nobject Main extends App {\nval balance: Ref[Int] = Ref(0)\ndef pay(amount: Int): Unit = atomic { implicit tx =>\nTxn.afterCommit(_ => println(\"Transfer:\" + amount))\nbalance += amount\nif(balance() < 0){\nthrow new RuntimeException\n}\n}\n\nval t1 = new Thread(() => { atomic { implicit tx =>\npay(2)\npay(-4)\n}})\nt1.start()\nprintln(balance.single.get)\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#7", "metadata": {"Header 1": "STM - Software Transactional Memory", "Header 2": "ScalaSTM", "Header 3": "Lifecycle Callbacks", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#7", "page_content": "The STM also offers some callback functions for certain lifecycle states: \n- `Txn.afterCommit(handler: Status => Unit)`\n- `Txn.afterRollback(handler: Status => Unit)`\n- `Txn.beforeCommit(handler: (InTxn) => Unit)(implicit txn: InTxn): Unit`\n- `Txn.rollback(cause: RollbackCause)(implicit txn: InTxnEnd): Nothing`\n- `Txn.retry(implicit txn: InTxn): Nothing` \nThe Status indicates whether the transaction has `completed` (been rolled back or committed) or has only been `decided`. \n```scala\ndef transfer(from: STMAccount, to: STMAccount, amount: Double) {\natomic { implicit txn =>\nto.deposit(amount)\nfrom.withdraw(amount)\nTxn.afterCommit { _ => sendMail(to.email, \"You've got $\" + amount) }\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#8", "metadata": {"Header 1": "STM - Software Transactional Memory", "Header 2": "ScalaSTM", "Header 3": "Behind the Scenes", "path": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/scalaSTM.mdx#8", "page_content": "Behind the scenes, the STM system keeps a global version counter for the last successfully committed transaction. Additionally, each Ref is marked with a so-called local version stamp which is the version of the last successfully committed transaction which modified the reference. When a new transaction is started the following is done: \n1. Transaction start: The new transaction stores the value of the global version counter locally, this is the so-called read version.\n2. Transaction body: Before a Ref is modified or read from, a local working copy is made of it and only this local copy is read from and written to. For every access of the Ref the local version stamp of the Ref is compared to the read version and if it is larger than the read version the transaction is aborted and retried.\n3. Transaction commit: All original Refs that were modified are locked (with a timeout to avoid deadlocks). Then the global version counter is incremented and copied locally for the transaction, this is the so-called write version. All Refs are then checked again and the transaction is aborted and retried if the version stamp is larger than the read version or the Ref is locked by another transaction. Then finally the values and the write version are written to the original Refs and the locks are released."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#1", "metadata": {"Header 1": "Synchronizers", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#1", "page_content": "A synchronizer is any object that coordinates and synchronizes the control flow of threads based on its state. The simplest form of synchronizer we have already used: the lock."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#2", "metadata": {"Header 1": "Synchronizers", "Header 2": "Semaphore", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#2", "page_content": "A semaphore is an integer variable that represents a resource counter which can also be interpreted as a number of permits to access the resource. The main usage for semaphores is to restrict the number of threads that can access some physical or logical resource. \n```java\npublic class Semaphore {\npublic Semaphore(int permits) {...}\n// acquires a permit, blocking until one is available, or the thread is interrupted.\npublic void acquire() throws InterruptedException {...}\n// acquires a permit, blocking until one is available.\npublic void acquireUninterruptibly() {...}\npublic void release() {...}\n}\n``` \n \nIt is, for example, perfect for implementing the CarPark class as previously seen: \n```java\nclass SemaphoreCarPark implements CarPark {\nprivate final Semaphore s;\npublic SemaphoreCarPark(int places) {\ns = new Semaphore(places);\n}\npublic void enter() {\ns.acquireUninterruptibly();\nlog(\"enter carpark\");\n}\npublic void exit() {\nlog(\"exit carpark\");\ns.release();\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#3", "metadata": {"Header 1": "Synchronizers", "Header 2": "Semaphore", "Header 3": "Lock Using a Semaphore", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#3", "page_content": "A binary semaphore (only holding 1 permit) can be used as a lock. The only problems with this lock are that it isn't reentrant and that the lock can be released by a thread other than the one that originally acquired it. \n```java\nclass SemaphoreLock {\nprivate final Semaphore mutex = new Semaphore(1);\npublic void lock() { mutex.acquireUninterruptibly(); }\npublic void unlock() { mutex.release(); }\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#4", "metadata": {"Header 1": "Synchronizers", "Header 2": "Read-Write Lock", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#4", "page_content": "The motivation for a ReadWriteLock is that if we use the same lock for reading and writing then only one thread can read at a time even though there wouldn't be any problems if multiple threads could read at the same time. To solve this a ReadWriteLock maintains a pair of locks, a lock for reading which can be held simultaneously by multiple readers and a write lock that can only be held by one thread. This leads to two possible states: either one thread is writing, or one or multiple threads are reading. \n```java\npublic interface ReadWriteLock {\nLock readLock(); // allows for concurrent reads\nLock writeLock(); // writes are exclusive\n}\n``` \n \n```java\nclass KeyValueStore {\nprivate final Map<String, Object> m = new TreeMap<>();\nprivate final ReadWriteLock rwl = new ReentrantReadWriteLock();\nprivate final Lock r = rwl.readLock();\nprivate final Lock w = rwl.writeLock();\npublic Object get(String key) {\nr.lock(); try { return m.get(key); } finally { r.unlock(); }\n}\npublic Set<String> allKeys() {\nr.lock(); try { return new HashSet<>(m.keySet()); } finally { r.unlock(); }\n}\npublic void put(String key, Object value) {\nw.lock(); try { m.put(key, value); } finally { w.unlock(); }\n}\npublic void clear() {\nw.lock(); try { m.clear(); } finally { w.unlock(); }\n}\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#5", "metadata": {"Header 1": "Synchronizers", "Header 2": "Countdown Latch", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#5", "page_content": "A CountDownLatch delays the progress of threads until the latch reaches its terminal state. The main usage for a CountDownLatch is to ensure that an activity does not proceed until another one-time action has been completed. \n```java\npublic class CountDownLatch {\npublic CountDownLatch(int count) {...}\n// Causes the current thread to wait until the latch has counted down to zero\npublic void await() {...}\n// Decrements the count, releasing all waiting threads if the count reaches zero.\npublic void countDown() {...}\npublic long getCount() {...}\n}\n``` \nHere there are two common scenarios. Either a thread wants to wait until some other actions are done, or a thread is used as a sort of starting gun for other threads. \n \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#6", "metadata": {"Header 1": "Synchronizers", "Header 2": "Cyclic Barrier", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#6", "page_content": "A CyclicBarrier allows a set of threads to all wait for each other to reach a common barrier point. \n```java\npublic class CyclicBarrier {\npublic CyclicBarrier(int nThreads) {...}\npublic CyclicBarrier(int nThreads, Runnable barrierAction) {...}\npublic void await() {...}\n}\n``` \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#7", "metadata": {"Header 1": "Synchronizers", "Header 2": "Exchanger", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#7", "page_content": "An Exchanger allows two threads to wait for each other and exchange an object. This can be especially useful when the object is very big as it can be reused. \n```java\npublic class Exchanger<T> {\npublic T exchange(T t) {...}\n}\n``` \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#8", "metadata": {"Header 1": "Synchronizers", "Header 2": "Blocking Queue", "path": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/synchronizers.mdx#8", "page_content": "A BlockingQueue is a queue that supports operations to wait for the queue to become non-empty when retrieving an element, and to wait for space to become available when storing an element. This is especially commonly used in the Producer-Consumer pattern. \n \n```java\npublic interface BlockingQueue<E> extends Queue<E> {\nE take() throws InterruptedException;\nvoid put(E e) throws InterruptedException;\n...\n}\n``` \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#1", "metadata": {"Header 1": "Threads", "Header 2": "Processes vs Threads", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#1", "page_content": "A process is an executable program that is loaded into memory. A process has its own logical memory address space allocated by the kernel. As seen in C we can also switch between processes but this is a rather expensive operation. Processes can communicate with each other via signals, interprocess communication (IPC), files or sockets. \nA thread is a single sequential flow that runs in the address space of its process. This also means it shares that address space with the other threads of the same process. It does, however, have its own execution context containing, amongst other things, the call stack. In contrast, threads communicate with each other via shared memory which, as we will see, is a very dangerous but practical thing. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#2", "metadata": {"Header 1": "Threads", "Header 2": "Threading models", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#2", "page_content": "Threading models define how threads are managed. \n- Kernel-Level (1:1): The kernel controls the threads and processes and threads are scheduled to available CPUs by the kernel. This approach is used by most current JVM implementations.\n- User-level (1:n): Threads are implemented and managed/scheduled by a runtime library, so-called green threads. This allows for efficient context switching and application-specific scheduling as the kernel is not involved. This does however mean that different threads can not be scheduled on different processors.\n- Hybrid (m:n): User-level threads are assigned to some kernel threads. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#3", "metadata": {"Header 1": "Threads", "Header 2": "Scheduling", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#3", "page_content": "Scheduling is done by the kernel and is the act of allocating CPU time to threads. It also has to make sure that each CPU processor only has one thread running at any given time."}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#4", "metadata": {"Header 1": "Threads", "Header 2": "Scheduling", "Header 3": "Cooperative", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#4", "page_content": "With cooperative scheduling, the threads decide when they should give up the processor to other threads. Meaning the processor never interrupts a thread to initiate a context switch from one thread to another. This can lead to threads hogging the processor or even locking it up completely. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#5", "metadata": {"Header 1": "Threads", "Header 2": "Scheduling", "Header 3": "Preemptive", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#5", "page_content": "With preemptive scheduling, the kernel can interrupt the running thread at any time. This stops threads from unfairly hogging the processor. It is up to the Java implementation but in most implementations, preemptive scheduling is used. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#6", "metadata": {"Header 1": "Threads", "Header 2": "Java Threads", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#6", "page_content": "In Java, a program's entry point is the main function that starts the initial thread, the main thread (non-daemon). Java defines the functional interface `Runnable` which should be implemented by any class whose instances are intended to be executed by a thread. \n```java\ninterface Runnable {\nvoid run();\n}\n``` \nThe `Thread` class represents a thread in Java and takes a Runnable whilst also implementing the Runnable interface. The `start()` function creates a new thread which then executes `thread.run()`, which in turn executes the passed `runnable.run()` on the separate thread. The call to `start()` returns immediately as the rest is executed on the separate thread. \n```java\nclass Thread implements Runnable{\nThread(Runnable target){...}\nThread(Runnable target, String name){...}\n\nvoid run(){...}\nvoid start(){...}\nvoid join(){...}\nvoid join(long millis){...}\nstatic void sleep(long millis){...}\nstatic void yield(){...}\nvoid setDaemon(boolean b){...}\nvoid setPriority(int newPriority){...}\n}\n``` \nThere are a few ways you can use the Thread class. You can extend the Thread class and implement the run method which is an easy and simple way to make use of threads. However, it is better to implement Runnable separately and pass it to the Thread class as this is a better separation of concerns. You can still access the thread methods by using static imports. Because Runnable is a functional interface you can also use lambdas which is, in my opinion, the way to go for simple examples. \n```java\nclass ThreadExamples {\n// Extending Thread\nstatic class MyThread extends Thread {\npublic void run() {\nSystem.out.println(\"MyThread running\");\n}\n}\n\n// Anonymous subclass of Thread\nstatic Thread thread = new Thread() {\npublic void run() {\nSystem.out.println(\"Anonymous MyThread running\");\n}\n};\n\n// Implementing Runnable\nstatic class MyRunnable implements Runnable {\npublic void run() {\nSystem.out.println(\"MyRunnable running\");\n}\n}"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#7", "metadata": {"Header 1": "Threads", "Header 2": "Java Threads", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#7", "page_content": "// Implementing Runnable\nstatic class MyRunnable implements Runnable {\npublic void run() {\nSystem.out.println(\"MyRunnable running\");\n}\n}\n\n// Anonymous implementation of Runnable\nRunnable myRunnable = new Runnable() {\npublic void run() {\nSystem.out.println(\"Anonymous MyRunnable running\");\n}\n};\n\n// Lambda runnable\nstatic Runnable lambdaRunnable = () -> System.out.println(\"Lambda Runnable running\");\n\npublic static void main(String[] args) {\nMyThread t1 = new MyThread();\nThread t2 = new Thread(new MyRunnable());\nThread t3 = new Thread(lambdaRunnable);\nThread t4 = new Thread(() -> System.out.println(\"Inline Lambda Runnable running\"));\nt1.start();\nt2.start();\nt3.start();\nt4.start();\n// the JVM waits for all non-daemon threads to finish before exiting\n}\n}\n``` \nThe yield function hints to the scheduler that the calling thread is willing to yield its use of the processor, but it can just be ignored by the scheduler. \nThe join function blocks the calling thread and waits for the thread on which it was called until it terminates. A number of milliseconds can also be passed to the join function which defines the maximum amount of time to wait for the thread to terminate. \nWith the setDaemon function, a thread can be marked as either a daemon or user thread. This function must be called before the thread is started because the type of thread can not be changed whilst it is running. If a process only has daemon threads left then the process stops and therefore also the threads. \nThreads can have a priority which is an integer value in the range of 1 to 10 (10 being the highest priority). The JVM is free to implement these priorities which means that they can also be ignored. \n"}}
+{"id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#8", "metadata": {"Header 1": "Threads", "Header 2": "Java Threads", "Header 3": "Exceptions in Java Threads", "path": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx", "id": "../pages/digitalGarden/cs/concurrentParallel/threads.mdx#8", "page_content": "If an exception is thrown in a thread it can be caught and handled inside the thread. However, if the exception is never caught the thread will just terminate. This is why `join()` returns and the main thread can carry on with its work; the exception itself is lost. \n```java\npublic static void main(String[] args) throws Exception {\nThread t = new Thread(() -> {\nint value = 1 / 0;\n});\nt.start();\nt.join();\nSystem.out.println(\"Main continues\");\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/cpp/classes.mdx#1", "metadata": {"Header 1": "Classes", "Header 2": "Constructors", "path": "../pages/digitalGarden/cs/cpp/classes.mdx", "id": "../pages/digitalGarden/cs/cpp/classes.mdx#1", "page_content": "In C++ primitive types don't have constructors so you need to initialize them in the constructor. \n```cpp\n#include <iostream>\nusing namespace std;\n\nclass Point {\nprivate:\ndouble m_x;\ndouble m_y;\n\npublic:\n// default constructor\nPoint() = default; // or just Point(){};\nPoint(double x, double y) { // Point object is already initialized\nm_x = x; // lots of copiessssss\nm_y = y;\n}\n};\n\nint main()\n{\nPoint p1; // default constructor, very bad: nothing is initialized\nPoint p2(1, 2);\n}\n``` \nHowever, the above example is a bad way of creating a constructor as the object is already initialized and then the values are changed; this leads to lots of member-wise copying and unnecessarily used memory. Even worse is the default constructor which leaves the attributes uninitialized because, as mentioned, the primitives don't have a default constructor."}}
+{"id": "../pages/digitalGarden/cs/cpp/classes.mdx#2", "metadata": {"Header 1": "Classes", "Header 2": "Constructors", "Header 3": "Initializer lists", "path": "../pages/digitalGarden/cs/cpp/classes.mdx", "id": "../pages/digitalGarden/cs/cpp/classes.mdx#2", "page_content": "Instead, in modern C++ we use initializer lists which avoids the copying and everything bad mentioned above. \n```cpp\n#include <iostream>\nusing namespace std;\n\nclass Point {\nprivate:\ndouble m_x;\ndouble m_y;\n\npublic:\n// default constructor\nPoint() : m_x(0), m_y(0) {};\nPoint(double x, double y): m_x(x), m_y(y) {};\n};\n\nint main()\n{\nPoint p1; // default constructor, x and y are 0\nPoint p2(1, 2);\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/cpp/classes.mdx#3", "metadata": {"Header 1": "Classes", "Header 2": "Constructors", "Header 3": "Default parameters", "path": "../pages/digitalGarden/cs/cpp/classes.mdx", "id": "../pages/digitalGarden/cs/cpp/classes.mdx#3", "page_content": "We can improve the above constructor even more by using default parameter values. Because the arguments don't change, and we don't want them to, we can also mark them const. \n```cpp\n#include <iostream>\nusing namespace std;\n\nclass Point {\nprivate:\ndouble m_x;\ndouble m_y;\n\npublic:\nPoint(const double x = 0, const double y = 0): m_x(x), m_y(y) {};\n};\n\nint main()\n{\nPoint p1; // x and y are 0\nPoint p2(1, 2);\n}\n```"}}
+{"id": "../pages/digitalGarden/cs/cpp/classes.mdx#4", "metadata": {"Header 1": "Classes", "Header 2": "Constructors", "Header 3": "Explicit constructors", "path": "../pages/digitalGarden/cs/cpp/classes.mdx", "id": "../pages/digitalGarden/cs/cpp/classes.mdx#4", "page_content": "You need to be very careful when creating constructors and often have to define a constructor as explicit, otherwise the compiler might implicitly convert a value into an object of that type. \n```cpp\n#include