From dcb96c3bfbe282a0acdd867e86d5d062f91cb1a0 Mon Sep 17 00:00:00 2001 From: Lucifer Uchiha Date: Fri, 26 Jul 2024 17:19:48 +0200 Subject: [PATCH] pinecone things --- .gitignore | 4 +- chunker/chunks.jsonl | 1574 ------------------------------------- chunker/main.py | 97 --- chunker/requirements.txt | 2 - pinecone/chunks.jsonl | 10 + pinecone/main.py | 176 +++++ pinecone/requirements.txt | 5 + 7 files changed, 194 insertions(+), 1674 deletions(-) delete mode 100644 chunker/chunks.jsonl delete mode 100644 chunker/main.py delete mode 100644 chunker/requirements.txt create mode 100644 pinecone/chunks.jsonl create mode 100644 pinecone/main.py create mode 100644 pinecone/requirements.txt diff --git a/.gitignore b/.gitignore index 613f8e7..2c28aab 100644 --- a/.gitignore +++ b/.gitignore @@ -6,4 +6,6 @@ node_modules .venv venv -*/**/__pycache__ \ No newline at end of file +*/**/__pycache__ + +*/**/.env \ No newline at end of file diff --git a/chunker/chunks.jsonl b/chunker/chunks.jsonl deleted file mode 100644 index 9540736..0000000 --- a/chunker/chunks.jsonl +++ /dev/null @@ -1,1574 +0,0 @@ -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "What is a Digital Garden?", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "A digital garden is a mix between a notebook and a blog, it is a place to share thoughts and cultivate them into a garden.\nIt also allows me to have a place where I can store my notes/summaries/tutorials for my studies. \nThe main difference to a blog is that a blog has articles and publication dates and never changes after it has been\npublished, whereas a digital garden is a place where the written content can be continuously edited and refined. The\nnotes are also very free flowing they can span from just a short cheatsheet to a full set of notes on an entire subject\nwhere you go into every nitty-gritty detail. \nAnother key difference is the navigation. 
A blog is usually read in chronological order but a digital garden can be read\nin any order you want and uses lots of internal links to connect all the notes into a Network (although this can be\nquite hard to diligently do). \nIf you are interested in learning more about digital gardens I can recommend the following\n[article by Maggie Appleton](https://maggieappleton.com/garden-history).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "How is my Garden Built?", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "The current iteration of my digital garden is built using [Nextra](https://nextra.site/). Nextra is a static site\ngenerator that is built on top of Next.js and MDX. This allows me to write my notes in markdown and also use the MDX\nformat to write JSX in my markdown files. These markdown files are then converted into static HTML files using Next.js\nand can be hosted on any static site hosting service such as [Vercel](https://vercel.com/).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "In this section I briefly go over some of the features that are supported by my digital garden and how to use them.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "Header 3": "Markdown", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "Markdown is supported out of the box. Anything that is supported by markdown can be used in the notes. 
This includes but\nis not limited to: \n- Headers\n- Lists\n- Links\n- Images\n- Code Blocks\n- Tables\n- Blockquotes \nFor a full list of markdown features check out the [Markdown Guide](https://www.markdownguide.org/).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "Header 3": "MDX", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "In addition to the normal markdown format, Nextra also supports the MDX format which allows you to write JSX, i.e. react code in a\nmarkdown file. To find out more about MDX check out the [official MDX documentation](https://mdxjs.com/). \n#### Admonitions / Callouts \nAdmonitions aren't included in standard markdown but have become very popular. Recently GitHub has also added support for\nadmonitions in markdown FileSystem, however they call them alerts. \nAdmonitions are very useful to highlight certain text and add a category to the text. I have added a custom component that\nbuilds on nextra's callouts to be able to add custom callout types. 
To use callouts in a MDX file you can use the following syntax: \n```\n\nThis Is a big scary warning.\n\n``` \nRenders to: \n\nThis Is a big scary warning.\n \nYou can also change the title of the banner: \n```\n\ninfo, warning, error, example, todo\n\n``` \n\ninfo, warning, error, example, todo\n \nThe default callout type uses the websites primary color, a rocket icon and has no title: \n\nThis is a default callout.\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "Header 3": "Jupyter Notebooks", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "\nTODO add how the hound works and how to use it.\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "Header 3": "LaTeX", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "It has recently become very popular to write LaTeX equations in markdown. Nextra supports this by using [KaTeX](https://katex.org/).\nYou can render LaTeX content either inline between `$\\LaTeX$` $\\LaTeX$ or as a block between `$$I = \\int_0^{2\\pi} \\sin(x)\\,dx$$`: \n$$\nI = \\int_0^{2\\pi} \\sin(x)\\,dx\n$$ \nAnnoyingly Jupyter Notebooks use MathJax to render LaTeX content in the same way instead of KaTeX. This means that KaTeX\nsupports some things and MathJax supports other things. Importantly however is that the Jupyter Notebooks get converted\nto Markdown and therefore in the end it will only be rendered in KaTeX. \nTherefore, if something is written that is supported in MathJax but not in KaTeX it might look okay but in the end,\nit will not be rendered by KaTeX. This leads to [my LaTeX Notation Guideline](./maths/latexGuidelines) to avoid\nconflicts whilst still keeping nice Formulas. 
\nYou can see what is supported by KaTeX [here,](https://katex.org/docs/supported.html) and you can see what is supported\nby MathJax [here](https://docs.mathjax.org/en/latest/input/tex/macros/index.html).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "The Features", "Header 3": "PlantUML", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "If you ever need to create diagrams and especially UML diagrams, PlantUML is the way to go. I started with Mermaid\nto create UML diagrams but swapped to PlantUML for the additional features and the ability to create custom themes\n(so everything can be minimalist and purple :D). \nTo render PlantUML diagrams the [Remark plugin Simple PlantUML](https://github.com/akebifiky/remark-simple-plantuml) is\nused which uses the official PlantUML server to generate an image and then adds it. \nAn Example can be seen below, on the [official website](https://plantuml.com/) and also on [REAL WORLD PlantUML](https://real-world-plantuml.com/?type=class). \n```plantuml\n@startuml\n\ninterface Command {\nexecute()\nundo()\n}\nclass Invoker{\nsetCommand()\n}\nclass Client\nclass Receiver{\naction()\n}\nclass ConcreteCommand{\nexecute()\nundo()\n}\n\nCommand <|-down- ConcreteCommand\nClient -right-> Receiver\nClient --> ConcreteCommand\nInvoker o-right-> Command\nReceiver <-left- ConcreteCommand\n\n@enduml\n``` \nTo use my custom theme you can use the following line at the beginning of the PlantUML file: \n```\n@startuml\n!theme purplerain from http://raw.githubusercontent.com/LuciferUchiha/georgerowlands.ch/main\n\n...\n\n@enduml\n``` \nHowever, it seems like when using a custom theme There can not be more then one per page? My custom theme also has some processes built in for simple text coloring for example in cases of success, failure etc. 
\n```plantuml\n@startuml\n!theme purplerain from http://raw.githubusercontent.com/LuciferUchiha/georgerowlands.ch/main\n\nBob -> Alice : normal\nBob <- Alice : $success(\"success: Hi Bob\")\nBob -x Alice : $failure(\"failure\")\nBob ->> Alice : $warning(\"warning\")\nBob ->> Alice : $info(\"finished\")\n\n@enduml\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "My Digital Garden", "Header 2": "How can I Contribute?", "path": "../pages/digitalGarden/index.mdx"}, "page_content": "Do you enjoy the content and want to contribute to the garden by adding some new plants or watering the existing ones?\nThen feel free to make a pull request. There are however some rules to keep in mind before adding or changing content. \n- Markdown filenames and folders are written in camelCase.\n- Titles should follow the\n[IEEE Editorial Style Manual](https://www.ieee.org/content/dam/ieee-org/ieee/web/org/conferences/style_references_manual.pdf).\nThey should also be added to the markdown file and specified in the `_meta.json` which maps files to titles and is also\nresponsible for the ordering.\n- LaTeX should conform with my notation and guideline, if something is not defined there you can of course add it.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Computer Systems", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx"}, "page_content": "This Page is meant as a brief introduction to the types of computer systems there are and how they are made and the issues we have faced with making improvements to our computer systems over the years.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Computer Systems", "Header 2": "Types of Computer Systems", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx"}, "page_content": "Let us first answer the question of what is a computer. What is the difference between a computer and just any other machine. A Computer is a programmable machine. 
Meaning that it can change its behavior and functionality, unlike other simple machines. \nMost commonly Computers are split up into the following types: \n- Personal Computer, short PC. The PC is the type of computer most people use and think of when talking about a computer. It serves a very general purpose and offers a wide variety of software to solve problems in our day-to-day life. In more recent years this type has also seen the addition of personal mobile devices (PMD) or more commonly known as smartphones and tablets. These devices are meant for the average consumer which makes subjects them to cost/performance tradeoffs.\n- Server computers or also just servers are computers that are usually accessed only over a network (Internet or LAN). Servers are built from the same basic technology as personal computers but with more performance and storage capabilities. Since they are also used by multiple people and are used to communicate between different applications and/or networks they have to be reliable to mitigate downtime.\n- Supercomputers, these computers represent the peak of what can be done with computers and are mainly used for research and academic purposes. You can find out more about the top supercomputers [here](https://www.top500.org/).\n- Embedded computers are the most used computers but people would never think so as they are usually hidden. 
They have a very wide range of applications and performances for example being part of your car to optimize fuel efficiency down to controlling the temperature in your coffee machine.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Computer Systems", "Header 2": "Components of a Computer", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx"}, "page_content": "\nCPU = Control + Datapath\nMemory\nIO\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Computer Systems", "Header 2": "How are Chips made", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx"}, "page_content": "blbabla silicon and moores law. yield etc. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Computer Systems", "Header 2": "The Power Wall", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx"}, "page_content": "As transistors get smaller, their power density stays constant. power wall and denard scaling\ncant reduce voltage because of noise => bits getting flipped and cant cool \nlead to hift to Multicore Processors \nblabla amhdals law, cant infinetly speeedup there is some limit.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Computer Systems", "Header 2": "Programming a Computer", "path": "../pages/digitalGarden/cs/computerArchitecture/computerSystems.mdx"}, "page_content": "blabla high level, compiler assembler instruciton sets", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "Working with numbers on computer systems is slightly more complex then one would think due to the fact that computers work only with the binary numbers 1 and 0.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Integers", "Header 3": "Unsigned Integers", "path": 
"../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "with n bits we can represent $2^n$ things. Encoding unsigned integers, i.e integers with no sign so positive numbers is pretty simple. The first bit called the LSB corresponds to $2^0$, the second one $2^1$. If that bit is set to 1 we add the value corresponding to that bit and receive the result. So if we have 32 bits we can represent $2^32$ things, so if we start at 0 we can represent the range from 0 to $2^32-1$. \nThis can also be described mathematically as followed if we denote our binary representation as $B=b_{n-1},b_{n-2},..,b_0$ and the function $D(B)$ which maps the Binary representation to its corresponding value. \n$$\nD(B)= \\sum_{i=0}^{n-1}{b_i \\cdot 2^i}\n$$", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "When we involve signed integers it gets a bit more complex since now we also want to deal with negative numbers. In history there have been a few representations for encoding signed integers which often get forgotten.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "Header 3": "Sign Magnitude", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "The idea for the sign and magnitude representation is a very simple one. You have a bit (the MSB) that represents the sign, 1 for negative, 0 for positive. All the other bits are the magnitude i.e. the value. \n$$\nD(B)= (-1)^{b_{n-1}} \\cdot \\sum_{i=0}^{n-2}{b_i \\cdot 2^i}\n$$ \n \n\n$$\n\\begin{align*}\n0000\\,1010_2 &= 10 \\\\\n1000\\,1010_2 &= -10\n\\end{align*}\n$$\n \nSeems pretty simple. However, there are two different representations for 0 which isn't good since computers often make comparisons with 0. 
This could potentially double the number of comparisons needed to be made which is one reason why this sign magnitude representation is not optimal.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "Header 3": "One's Complement", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "The idea of the one's complement is also very simple, it is that we want to quickly find the negative number of the same positive value by just flipping all the bits. In other words: \n$$\n-B=\\,\\sim B\n$$ \nAnd mathematically defined: \n$$\nD(B)= -b_{n-1}(2^{n-1}-1) + \\sum_{i=0}^{n-2}{b_i \\cdot 2^i}\n$$ \n \n\n$$\n\\begin{align*}\n0000\\,1010_2 &= 10 \\\\\n1111\\,0101_2 &= -10\n\\end{align*}\n$$\n \nhowever just like the sign magnitude representation the one's complement has the issue of having 2 representations for 0.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Signed Integers", "Header 3": "Two's Complement", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "Finally, we have the representation that is used nowadays, the two's complement. This representation solves the issue of the double representation of 0 whilst still being able to quickly tell if a number is positive or negative. It does however lead to there not being a positive value corresponding to the lowest negative value. 
\n$$\nD(B)= -b_{n-1}(2^{n-1}) + \\sum_{i=0}^{n-2}{b_i \\cdot 2^i}\n$$ \n \n\n$$\n\\begin{align*}\n0000\\,1010_2 &= 10 \\\\\n1111\\,0110_2 &= -10\n\\end{align*}\n$$\n \nAny easy way to calculate the negative value of a given value with the two's complement representation is the following: \n$$\n\\sim B + 1 \\Leftrightarrow -B\n$$ \n#### Sign Extension \nWhen using the two's complement we do need be aware of something when converting a binary number with $n$ bits to a binary number with $n+k$ bits and it is called sign extension. Put simply for the value of binary number to stay the same we need to extend the sign bit. \n \n\n$$\n\\begin{align*}\n10:&\\, 0000\\,1010_2 \\Rightarrow 0000\\,0000\\,0000\\,1010_2 \\\\\n-10:&\\, 1111\\,0110_2 \\Rightarrow 1111\\,1111\\,1111\\,0110_2\n\\end{align*}\n$$\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "Representing real numbers can be pretty hard as you can imagine since real numbers can be infinite numbers such as $\\pi = 3.14159265358979323846264338327950288...$ but we only have finite resources and bits to represent them for example 4 or 8 bytes. Another problem is that often times when working with real numbers we find ourselves using very small or very large numbers such as $1$ Lightyear $=9'460'730'472'580.8\\,km$ or the radius of a hydrogen atom $0.000000000025\\,m$.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Binary Fractions", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "One way, but not a very good way to represent real numbers is to use binary fractions. Binary fractions are a way to extend the unsigned integer representation by adding a so-called binary/zero/decimal point. 
To the left of the binary point, we have just like with the unsigned representation the powers of 2. To the right, we now also use the powers of 2 with negative numbers to get the following structure: \n$$\nB = b_{i},b_{i-1},..,b_0\\,.\\,b_{-1},...,b_{-j+1},b_{-j}\n$$ \nAnd Formula: \n$$\nD(B) = \\sum_{k=-j}^{i}{b_k \\cdot 2^k}\n$$ \n \n\n$$\n\\begin{align*}\n5 \\frac{3}{4} &= 0101.1100_2 \\\\\n2 \\frac{7}{8} &= 0010.1110_2 \\\\\n\\frac{63}{64} &= 0.1111110_2\n\\end{align*}\n$$\n \nFrom the above examples we can make 3 key observations the first 2 might already know if you have been programming for a long time. \n- Dividing by powers of 2 can be done with shifting right $x / 2^y \\Leftrightarrow x >> y$\n- Multiply with powers of 2 can be done with shifting left $x \\cdot 2^y \\Leftrightarrow x << y$ \nThis representations does have its limits since we can only represent numbers of the form $\\frac{x}{s^k}$ other numbers such as $\\frac{1}{3}$ have repeating bit representations.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Fixed Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "The fixed-point representation or also called $p.q$ fixed-point representation extends the idea of binary fractions by adding a sign bit making the left part of the binary point the same as the two's complement. The right part is the same fractional part. The number of bits for the integer part (including the sign) bit corresponds to $p$ the number of bits for the fractional part corresponds to $q$, 17.14 being the most popular format. \n$$\nD(P)=-b_p \\cdot 2^p + \\sum_{k=-q}^{p-1}{b_k \\cdot 2^k}\n$$ \n \nThis representation has many pros, it is simple we can use simple arithmetic operations and don't need special floating-point hardware which is why it is commonly used in many low-cost embedded processors. 
The only con is that we can not represent a wide range of numbers which we will fix with the next and last representation.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "In 1985 the [IEEE standard 754](https://standards.ieee.org/ieee/754/993/) was released and quickly adapted as the standard for so-called floating-point arithmetic. In 1989, William Kahan, one of the primary architects even received the Turing Award, which is like the noble prize for computer science. The floating-point representation builds on the ideas of the fixed-point representation and [scientific notation](../../Mathematik/scientificNotation). \nFloating-point representation consists of 3 parts, the sign bit, and like the scientific notation an exponent and mantissa. \n \nWe most commonly use the following sizes for the exponent and mantissa: \n- Single precision: 8 bits for the exponent, 23 bits for the mantissa making a total of 32 bits with the sign bit.\n- Double precision: 11 bits for the exponent, 52 bits for the mantissa making a total of 64 bits. It doesn't offer much of a wider range then the single precision however, it does offer more precision, hence the name. \nIn 2008 the IEEE standard 754 was revised with the addition of the following sizes: \n- Half precision: 5 bits for the exponent, 10 bits for the mantissa making a total of 16 bits.\n- Quad precision: 15 bits for the exponent, 112 bits for the mantissa making a total of 32 bits. \nWith the rise of artificial intelligence and neural networks, smaller representations have gained popularity for quantization. 
This popularity introduced the following so-called minifloats consisting of 8 bits in total: \n- E4M3: as the name suggests 4 bits for the exponent and 3 bits for the mantissa.\n- E5M2: 5 bits for the exponent and 2 bits for the mantissa. \nThe brain floating point which was developed by Google Brain is also very popular for AI as it has the same range as single precision due to using the same amount of bits for the exponent but with less precision.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "The brain floating point which was developed by Google Brain is also very popular for AI as it has the same range as single precision due to using the same amount of bits for the exponent but with less precision. \nThe floating-point representation used however normalized values just like the scientific notation. Meaning the mantissa is normalized to the form of \n$$\n1.000010010...110_2\n$$ \nSo, in reality, we are not actually storing the mantissa but only the fraction part which is why it is also commonly referred to as the fraction. This leads to two things, we get an extra bit for free since we imply that the first bit is 1, but we can no longer represent the value 0. We will however see later how we can solve the problem of representing 0. \nWe also do not store the exponent using the two's complement. Instead, we use the so-called biased notation for the simple reason of wanting to compare values quickly with each other. To do this we want a form where the exponent with all zeros $0000\\,0000$ is smaller than the exponent with all ones $1111\\,1111$ which wouldn't be the case when using the two's complement. Instead, we use a bias. To calculate the bias we use the number of bits used to represent the exponent $k$. 
For single precision $k=8$, the bias for single precision is $127$ calculated using the formula: \n$$\nbias = 2^{k-1}-1\n$$ \n\nNow that we understand the form of the floating-point representation let us look at an example. We want to store the value $2022$ using single precision floating-point. First, we set the sign bit in this case $0$. Then we convert the value to a binary fraction. Then we normalize it whilst keeping track of the exponent. Then lastly we store the fraction part and the exponent + the bias. \n$$\n\\begin{align}\n2022 &= 11111100110._2 \\cdot 2^0 & \\text{Convert to binary fraction} \\\\\n&= 1.1111100110_2 \\cdot 2^{10} & \\text{Shift binary point to normalize} \\\\\nM &= 1.1111100110_2 & \\text{Mantissa} \\\\\nFraction &= 1111100110_2 & \\text{Fraction} \\\\", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "$$\n\\begin{align}\n2022 &= 11111100110._2 \\cdot 2^0 & \\text{Convert to binary fraction} \\\\\n&= 1.1111100110_2 \\cdot 2^{10} & \\text{Shift binary point to normalize} \\\\\nM &= 1.1111100110_2 & \\text{Mantissa} \\\\\nFraction &= 1111100110_2 & \\text{Fraction} \\\\\nE &= 10 & \\text{Exponent} \\\\\nExp &= E + bias = 10 + 127 = 1000\\,1001_2 & \\text{Biased Exponent}\n\\end{align}\n$$ \n| Sign | Exponent | Fraction |\n| ---- | --------- | ---------------------------- |\n| 0 | 1000 1001 | 1111 1001 1000 0000 0000 000 |\n \n#### Denormalized values \nAs mentioned above we can't represent the value $0$ using the normalized values. For this, we need to use denormalized values or also often called subnormal. For this, in the case of single precision, we reserve the exponent that consists of only zeros so has the biased value $0$ and therefore the exponent $1-bias$, for single precision this would be $-126$. 
If the fraction also consists of all zeros then we have a representation for the value $0$. If it is not zero then we just have evenly distributed values close to 0. \n\n| Value | Sign | Exponent | Fraction |\n| ------------------------------------------------- | ---- | --------- | ---------------------------- |\n| 0 | 0 | 0000 0000 | 0000 0000 0000 0000 0000 000 |\n| -0 | 1 | 0000 0000 | 0000 0000 0000 0000 0000 000 |\n| $0.5 \\cdot 2^{-126} \\approx 5.877 \\cdot 10^{-39}$ | 0 | 0000 0000 | 1000 0000 0000 0000 0000 000 |\n| $0.99999 \\cdot 2^{-126}$ | 0 | 0000 0000 | 1111 1111 1111 1111 1111 111 |\n \n#### Special Numbers", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "| $0.5 \\cdot 2^{-126} \\approx 5.877 \\cdot 10^{-39}$ | 0 | 0000 0000 | 1000 0000 0000 0000 0000 000 |\n| $0.99999 \\cdot 2^{-126}$ | 0 | 0000 0000 | 1111 1111 1111 1111 1111 111 |\n \n#### Special Numbers \nFor some cases we want to be able to store some special values such as $\\infty$ if we do $1.0 / 0.0$ or $NaN$ when doing $\\sqrt{-1}$ or $\\infty - \\infty$. Just like with solving the issue of representing $0$, to represent special values we can reserve an exponent, in the case of single precision this is the exponent consisting of only ones. If the fraction only consists of zeros then it represents the value $\\infty$ otherwise if the fraction is not all zeros it represents $NaN$. 
\n| Value | Sign | Exponent | Fraction |\n| --------- | ---- | --------- | ---------------------------- |\n| $\\infty$ | 0 | 1111 1111 | 0000 0000 0000 0000 0000 000 |\n| $-\\infty$ | 1 | 1111 1111 | 0000 0000 0000 0000 0000 000 |\n| $NaN$ | 0 | 1111 1111 | 1000 0000 0000 0000 0000 000 |\n| $NaN$ | 1 | 1111 1111 | 1111 1111 1111 1111 1111 111 | \nFor other representations such as the E4M3, E5M2 or bfloat16 the handling of special numbers can be different. This comes down to there being less bits and therefore each bit having more meaning so reserving an entire exponent range just to represent $NaN$ would be a big waste: \n|| E4M3 | E5M2 |\n| ------------------ | ---------------- | ------------------------ |\n| $-\\infty / \\infty$ | N/A | $S\\,11111\\,00_2$ |\n| $NaN$ | $S\\,1111\\,111_2$ | $S\\,11111\\,{01,10,11}_2$ |\n| $-0/0$ | $S\\,0000\\,000_2$ | $S\\,00000\\,00_2$ | \n#### Precision \nAs mentioned at the beginning of the floating-point section Real numbers are in theory infinite however we can not represent an infinite amount of numbers with a finite number of bits. Below you can see an estimated visualization of what values can actually be represented. \n \nAt a closer look, we can also see how the representations are distributed with the values close to zero being very precise. \n \nThis issue can however cause problems of imprecision if a certain number can not be represented and is rounded to the closest number that can be represented. For example in C we can do the following: \n```c\n#include \nint main ()\n{\ndouble d;\nd = 1.0 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1;\nprintf (“d = %.20f\\n”, d); // Not 2.0, outputs 2.00000000000000088818\n}\n``` \n#### Rounding \nThe IEEE standard 754 defines four rounding modes: \n- Round-up\n- Round-down\n- Round-toward-zero, i.e truncate, which is commonly done when converting from integer to floating point.\n- Round-to-even, the most common but also the most complicated of the four modes. 
\nI will not go into detail of the first three modes as they are self-explanatory. Let us first look at why we need to round-to-even. The reason is actually pretty simple, normal rounding is not very fair. \n\n$$\n\\begin{align*}\n& &0.5+1.5+2.5+3.5 &= 8 \\\\\n\\text{Rounded: }& &1+2+3+4 &= 10 \\\\\n\\text{Round-to-even: }& &0 + 2 + 2 + 4 &= 8\n\\end{align*}\n$$\n \n\nThis part is not correct.\n \nWhen working with round-to-even we need to keep track of 3 things: \n- Guard bit: The LSB that is still part of the fraction.\n- Round bit: The first bit that exceeds the fraction.\n- Sticky bit: A bitwise OR of all the remaining bits that exceed the fraction. \nSo if we only have a mantissa of 4 bits, i.e a fraction with 3 bits then it could look like this: \n \nNow we have 3 cases: \n- If $GRS=0xx$ we round down, i.e do nothing since the LSB is already $0$.\n- If $GRS=100$ this is a so-called tie, if the bit before the guard bit is $1$ we round the mantissa up otherwise we round down i.e set the guard bit to $0$", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Working with Numbers", "Header 2": "Real Numbers", "Header 3": "Floating Points", "path": "../pages/digitalGarden/cs/computerArchitecture/workingWithNumbers.mdx"}, "page_content": "Now we have 3 cases: \n- If $GRS=0xx$ we round down, i.e do nothing since the LSB is already $0$.\n- If $GRS=100$ this is a so-called tie, if the bit before the guard bit is $1$ we round the mantissa up otherwise we round down i.e set the guard bit to $0$\n- For all other cases $GRS=110$, $GRS=101$ and $GRS=111$ we round up. \n\nAfter rounding, you might have to normalize and round again for example if we have $1.1111\\,1111|11$ with $GRS=111$ and Biased exponent $128$, i.e $2^1$. We have to round up and get $11.0000\\,0000$ therefore we need to increase the exponent by $1$ to normalize again. 
This also means that after rounding we can produce an overflow or underflow to infinity.\n \n#### Addition/Subtraction \n#### Multiplication", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Memory Hierarchy", "path": "../pages/digitalGarden/cs/computerArchitecture/memoryHierarchy.mdx"}, "page_content": "\nTo do about caches registers misses etc.\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "What is RISC-V?", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx"}, "page_content": "RISC-V is an open standard instruction set architecture that has been developed at the University of California, Berkeley since 2010 and is based on the established RISC principles which we will see when diving deeper.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "What is RISC-V?", "Header 2": "CISC vs RISC", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx"}, "page_content": "Up until about 1986 most chip manufacturers were using the CISC (Complex Instruction Set Computers) Architecture, the most common example of this being the Intel x86 ISA which is widely used nowadays. However, they realized that it makes building the chips more complicated and slows down potential improvements. This brought on the switch to RISC (Reduced Instruction Set Computers) which focuses on having a small number of simple instructions and then letting the compilers resolve complexity. Some of the most common examples of RISC are MIPS, ARM and the open-source RISC-V which is what we will be looking at. \nHowever, you might have realized that most computers that you interact with use x86, doesn't that mean that you aren't getting the best performance that you could? This is actually not true, almost all chips nowadays use the RISC architecture, including x86 chips. But I just said that x86 uses CISC? This is true for the early x86 chips, the x86 chips nowadays are hybrid chips. 
They support CISC instructions for backward compatibility as a lot of devices were already using CISC, however, inside the chips they convert the CISC instructions to RISC instructions and execute them, which makes them have a \"RISC\" core.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "What is RISC-V?", "Header 2": "Extensions", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx"}, "page_content": "RISC-V aims to be as lightweight as possible which is why it allows for extensions to be added for certain functionalities. This allows chip manufacturers to only add what they need and not have instructions that they never intend to use or support. We will mainly be focusing on the `RV32IG` variant which is equivalent to `RV32IMAFD`. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "What is RISC-V?", "Header 2": "Register Layout", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/riscV.mdx"}, "page_content": "RISC-V has 32 (or 16 in the embedded variant) integer registers and with the floating-point extension another separate 32 floating-point registers. Since our focus is on the 32-bit variation each register can store 32 bits. These registers are essential to the CPU as it can only work with data that is in a register it can not work on data in main memory. So if we want to manipulate data that is in the main memory we need to first transfer the data from the main memory to a register. \nCertain registers have restrictions or should be used in a certain way. Most notable is that the first register will always store the value 0. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Procedure Calls", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx"}, "page_content": "A procedure or function is one of the key building blocks for programmers as it allows them to create understandable and reusable code. It also a way to add abstraction and simplify a program. 
In simple terms, a procedure works as follows: \n1. Put the parameters in a place the procedure (callee) can access them.\n2. Transfer control to the procedure.\n3. Acquire the storage resources needed for the procedure.\n4. Perform the task.\n5. Put the result of the task in a place the caller can access it.\n6. Return control to the caller. \nFor the first and fifth point, we have the registers `x10-x17`. So that we know where to return to in step 6 we store the return address in `x1`, this would be done when transferring control to the procedure with the `jal x1, Label` instruction.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Procedure Calls", "Header 2": "Using More Registers", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx"}, "page_content": "But what if the 8 registers for the arguments are not enough to complete our task? We can use other registers as long as we clean up after ourselves, meaning we can spill registers to memory and then restore the registers before returning control. This leads us to the idea of the stack. In RISC-V we keep track of a stack pointer in `x2` and push and pop data to the stack. Important here is however that the stack grows from high to low addresses, meaning that when pushing data we need to subtract from the stack pointer. \nFor this we also remember that the registers `x5-x7` and `x28-x31` are temporary registers and do not need to be restored before returning control but the registers `x8-x9` and `x18-x27` are saved registers and do need to be restored. \n\nIn the below example we could just use the temporary registers to store the temporary values but instead, we will spill some registers to the stack to demonstrate how this could be done. 
\n```c\n// g in x10, h in x11, i in x12, j in x13\nint leaf( int g, int h, int i, int j) {\nint f;\nf = (g + h) - (i + j);\nreturn f;\n}\n``` \n```assembly\nleaf:\naddi sp, sp, -12 # make space on stack for 12 bytes (3 x 32 bits)\nsw x5,8(sp) # save x5\nsw x6,4(sp) # save x6\nsw x7,0(sp) # save x7\nadd x5,x10,x11 # x5 <- g + h\nadd x6,x12,x13 # x6 <- i + j\nsub x7,x5,x6 # x7 <- x5 - x6\naddi x10,x7,0 # write result to x10 <- x7\nlw x7,0(sp) # restore x7\nlw x6,4(sp) # restore x6\nlw x5,8(sp) # restore x5\naddi sp,sp,12 # adjust stack\njalr x0,0(x1) # return to caller\n```\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Procedure Calls", "Header 2": "Using More Registers", "Header 3": "Nested Procedures", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/procedureCalls.mdx"}, "page_content": "Procedures that do not call other procedures are called leaf procedures. But these are very rarely seen in programs, much more often we see nested procedures or even recursive procedures which need a lot of care when working with registers. For example, imagine the Procedure $A$ is called and the argument 3 is stored in `x10` and return address in `x1`. If $A$ then wants to call the procedure $B$ the argument in `x10` and return address in `x1` will be overwritten. So to prevent these collisions we must carefully push data to the stack and retrieve it again at a later time. \nTo aid this tricky task of keeping track of the local data of a procedure some RISC-V compilers use a frame pointer `fp` which is stored in the register `x8`. As the stack pointer can always change the frame pointer offers a stable base register for local memory references. 
\n \n\n```c\nint fact(int n)\n{\nif (n < 1)\nreturn 1;\nelse\nreturn n * fact(n-1);\n}\n``` \n```assembly\nfact:\naddi sp, sp, -8 # make space for 8 bytes\nsw x1, 4(sp) # save return address\nsw x10, 0(sp) # save n\naddi x11, x10, -1 # x11 <- n - 1\nbge x11, zero, L1 # if (x11 >= 0), goto L1\naddi x10, zero, 1 # x10 <- 1 (retval)\naddi sp, sp, 8 # adjust stack\njalr zero, 0(x1) # return\nL1:\naddi x10, x10, -1 # x10 <- n - 1\njal x1, fact # call fact(n-1)\naddi t1, x10, 0 # t1 <- fact(n-1)\nlw x10, 0(sp) # restore n\nlw x1, 4(sp) # restore return address\naddi sp, sp, 8 # adjust stack pointer\nmul x10, x10, t1 # x10 <- n * t1 (retval)\njalr zero, 0(x1) # return\n```\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Pseudo Instructions", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/pseudoInstructions.mdx"}, "page_content": "As I have already mentioned multiple times, some RISC-V implementations also offer pseudo instructions which are like aliases for other instructions but make the assembly code easier to read and understand. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Data Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx"}, "page_content": "Just using registries to store data is not enough which is why we also have main memory and secondary memory. Main memory is especially useful when working with composite data such as data structures or dynamic data. \nAs mentioned previously we can not directly work on data that is stored in memory, the CPU can only work on data that is in a registry. This leads us to load and store data between the registries and the main memory. \nEach byte in memory has an address. For composite data, RISC-V uses the little endian byte ordering meaning that the LSB byte is at the smallest address. 
\nRISC-V defines a word as data that consists of 32 bits this corresponds to the size of the registry and is the most common size to read and write to and from memory. However, we can also only read a byte which is useful since ASCII only uses a byte. RISC-V also supports reading a so-called halfword which corresponds to 16 bits which is useful when working with Unicode characters. \nWe do however need to keep in mind that in memory we only store the value, no context. So if we want a word to be handled like an unsigned integer we also need to specify that otherwise, it will treat it by default as a signed integer. \n| Instruction | Type | Example | Meaning |\n| ---------------------- | ---- | -------------------- | ------------------------------------------------ |\n| Load word | I | `lw rd, imm12(rs1)` | `R[rd] = Mem4[R[rs1] + SignExt(imm12)]` |\n| Load halfword | I | `lh rd, imm12(rs1)` | `R[rd] = SignExt(Mem2[R[rs1] + SignExt(imm12)])` |\n| Load byte | I | `lb rd, imm12(rs1)` | `R[rd] = SignExt(Mem1[R[rs1] + SignExt(imm12)])` |\n| Load word unsigned | I | `lwu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem4[R[rs1] + SignExt(imm12)])` |\n| Load halfword unsigned | I | `lhu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem2[R[rs1] + SignExt(imm12)])` |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Data Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx"}, "page_content": "| Load word unsigned | I | `lwu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem4[R[rs1] + SignExt(imm12)])` |\n| Load halfword unsigned | I | `lhu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem2[R[rs1] + SignExt(imm12)])` |\n| Load byte unsigned | I | `lbu rd, imm12(rs1)` | `R[rd] = ZeroExt(Mem1[R[rs1] + SignExt(imm12)])` |\n| Store word | S | `sw rs2, imm12(rs1)` | `Mem4[R[rs1] + SignExt(imm12)] = R[rs2]` |\n| Store halfword | S | `sh rs2, imm12(rs1)` | `Mem2[R[rs1] + SignExt(imm12)] = R[rs2](15:0)` |\n| Store byte | S | `sb rs2, imm12(rs1)` | `Mem1[R[rs1] + 
SignExt(imm12)] = R[rs2](7:0)` |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Data Transfer Operations", "Header 2": "Loading With Pointers", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx"}, "page_content": "Pointers in C are nothing else but memory addresses which means we can also load data from and store data to them. The most simple use of pointers is to swap two values: \n```c\n// x in a0, y in a1\nvoid swap(int *x, int *y)\n{\nint temp_x = *x;\nint temp_y = *y;\n*x = temp_y;\n*y = temp_x;\n}\n``` \nAnd as we can see we can use addresses stored in registers to load and write data: \n```assembly\nlw a4, 0(a0)\nlw a5, 0(a1)\nsw a5, 0(a0)\nsw a4, 0(a1)\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Data Transfer Operations", "Header 2": "Loading Sequential Data", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/dataTransfer.mdx"}, "page_content": "When reading sequential data we do need to keep in mind that each address only corresponds to a byte. Since a word is 4 bytes, this leads us to make \"jumps\" of size 4, so `A[8]` is at offset 32 and `A[9]` at offset 36. \n\n```c\n// h in x21, base address of A in x22\nA[9] = h + A[8]\n``` \n```assembly\nlw x9, 32(x22)\nadd x9, x21, x9\nsw x9, 36(x22)\n```\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Control Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx"}, "page_content": "When programming we often find ourselves using control structures like if and else this creates branches in our program where we either go down one or the other branch. RISC-V offers so-called branch instructions which in most cases take 2 operands and a Label to jump to after checking the condition. Labels are not some magic keywords, they are just an offset from the program counter (PC) that is automatically handled by the assembler. 
\n| Instruction | Type | Example | Meaning |\n| ------------------------------------- | ---- | ---------------------- | ------------------------------------------------------- |\n| Branch equal | SB | `beq rs1, rs2, imm12` | `if (R[rs1] == R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch not equal | SB | `bne rs1, rs2, imm12` | `if (R[rs1] != R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch greater than or equal | SB | `bge rs1, rs2, imm12` | `if (R[rs1] >= R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch greater than or equal unsigned | SB | `bgeu rs1, rs2, imm12` | `if (R[rs1] >=u R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch less than | SB | `blt rs1, rs2, imm12` | `if (R[rs1] < R[rs2]) pc = pc + SignExt(imm12 << 1)` |\n| Branch less than unsigned | SB | `bltu rs1, rs2, imm12` | `if (R[rs1] <u R[rs2]) pc = pc + SignExt(imm12 << 1)` | \nIn RISC-V you might notice that there is no greater than or less than or equal. This is because we can emulate these by just switching the operands, however, most assemblers offer pseudo instructions to make the assembly code more readable. \n\n```c\n// i in x22, j in x23, f in x19, g in x20, h in x21\nif (i == j)\nf = g + h;\nelse\nf = g - h;\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Control Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx"}, "page_content": "\n```c\n// i in x22, j in x23, f in x19, g in x20, h in x21\nif (i == j)\nf = g + h;\nelse\nf = g - h;\n``` \nIn the code below we can also see a so-called unconditional branch meaning we always jump to the given Label. This unconditional branch makes use of the register `x0` always holding the value 0. 
\n```assembly\nbne x22, x23, L1\nadd x19, x20, x21\nbeq x0, x0, Exit # unconditional", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Control Transfer Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx"}, "page_content": "L1:\nsub x19, x20, x21\nExit:\n```\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Control Transfer Operations", "Header 2": "Basic Blocks", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx"}, "page_content": "A basic block is a small building block for a program. It is a sequence of instructions that has no branches except at the end and no branch target except at the beginning. A goal of the compiler is to make as many big basic blocks as it can as this is better for optimization and reusability. \n\nLet us compare two different assembler outputs for the same code and look at their basic blocks. Our code does the following: \n```c\nint fact_while (int x) {\nint result = 1;\nwhile (x > 1) {\nresult *= x;\nx = x - 1;\n}\nreturn result;\n}\n``` \nIt is common to rewrite loops as goto commands when trying to convert high-level code to assembler code. \n```c\nint fact_while (int x) {\nint result = 1;\nLoop:\nif (x <= 1) goto Exit;\nresult = result * x;\nx = x - 1;\ngoto Loop;\nExit:\nreturn result;\n}\n``` \n```assembly\nfact_while:\naddi a5, a0, 0 # a5 = x (x)\naddi a0, zero, 1 # a0 = 1 (result)\nLoop:\naddi a4, zero, 1 # a4 = 1\nble a5, a4, Exit # if (x <= 1) goto Exit\nmul a0, a0, a5 # result *= x\naddi a5, a5, -1 # x = x - 1\nbeq zero, zero, Loop # goto Loop\nExit:\n``` \nThe assembly code above has 3 small basic blocks but if we convert the C code to this structure we can decrease the number of basic blocks and increase their size. 
\n```c\nint fact_while2 (int x) {\nint result = 1;\nif (x <= 1) goto Exit;\nLoop:\nresult = result * x;\nx = x – 1;\nif (x != 1) goto Loop;\nExit:\nreturn result;\n}\n``` \n```assembly\nfact_while2:\naddi a5, a0, 0 # a5 = x (x)\naddi a4, zero, 1 # a4 = 1\naddi a0, zero, 1 # a0 = 1 (result)\nble a5, a4, Exit # if (x <= 1) goto Exit\nLoop:\nmul a0, a0, a5 # result *= x\naddi a5, a5, -1 # x = x – 1\nbne a5, a4, Loop # if (x != 1) goto Loop\nExit:\n```\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Control Transfer Operations", "Header 2": "Target Adressing", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/controlTransfer.mdx"}, "page_content": "When jumping using the branch instructions most jumps are not very far. As mentioned before the label is like an immediate offset meaning it can be up to 12 bits long. If we want to jump further we can use one of the commands below. The `jal` instruction stands for jump and link, we jump using the passed offset which can now be 20 bits long. We also store the current PC i.e the return address into the corresponding `rd` register. If we want to jump even further then we can load a large immediate into a temporary register using the `lui` instruction and then add the remaining 12 bits and jump at the same time using the `jalr` instructions which also lets us read the offset from a register. 
\n| Instruction | Type | Example | Meaning |\n| ------------------------------------- | ---- | ---------------------- | ------------------------------------------------------- |\n| Jump and link | UJ | `jal rd, imm20` | `R[rd] = pc + 4; pc = pc + SignExt(imm20 << 1)` |\n| Jump and link register | I | `jalr rd, imm12(rs1)` | `R[rd] = pc + 4; pc = (R[rs1] + SignExt(imm12)) & (~1)` | \n\nWe can also use the `jal` instruction as an unconditional branch by using the zero register as the return address, which is the same as discarding it: \n```assembly\njal x0, Label\n```\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Arithmetic and Logical Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx"}, "page_content": "Arithmetic and logical operations are some of the key building blocks for writing any program as almost any functionality boils down to them.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Arithmetic Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx"}, "page_content": "In RISC-V all arithmetic operations have the same form, two sources (b and c) and one destination (a). Later on, we will learn more forms of operations and also how these operations are encoded to machine code, i.e to binary digits. \n```assembly\nadd x20, x21, x20\n``` \nThis is in aid of the first design principle of RISC-V \n> Simplicity favors regularity. \n\nIf we have the following C code \n```c\n// f in x19, g in x20, h in x21\n// i in x22, j in x23\nf = (g + h) – (i + j);\n``` \nand we compile it we can expect that the following RISC-V code will be assembled. \n```assembly\nadd x5, x20, x21\nadd x6, x22, x23\nsub x19, x5, x6\n``` \nHere we make use of the temporary registers `x5` and `x6`.\n \nWe will see what the immediate instructions are for further down. 
\n| Instruction | Type | Example | Meaning |\n| -------------------------------- | ---- | ---------------------- | ------------------------------------------- |\n| Add | R | `add rd, rs1, rs2` | `R[rd] = R[rs1] + R[rs2]` |\n| Subtract | R | `sub rd, rs1, rs2` | `R[rd] = R[rs1] - R[rs2]` |\n| Add immediate | I | `addi rd, rs1, imm12` | `R[rd] = R[rs1] + SignExt(imm12)` |\n| Set less than | R | `slt rd, rs1, rs2` | `R[rd] = (R[rs1] < R[rs2])? 1 : 0` |\n| Set less than immediate | I | `slti rd, rs1, imm12` | `R[rd] = (R[rs1] < SignExt(imm12))? 1 : 0` |\n| Set less than unsigned | R | `sltu rd, rs1, rs2` | `R[rd] = (R[rs1] Make the common case fast. \nImmediate operands are faster as they avoid a load instruction. However, due to the way instructions are encoded we can only use constants that use up to 12 bits. Later on, we will see how we can work with larger constants. \n```assembly\naddi x22, x22, 4\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Logical Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx"}, "page_content": "We also often find ourselves manipulating or working with bits which is what the logical operations are for. They are in most high-level programming languages the same with the most common exception being for the arithmetic or logical shift right operation. The key difference between these 2 is that the logical version fills the left with zeros whereas the arithmetic version fills it with the sign bit, resulting in a sign-extension that preserves the sign of the value. 
\n| Instruction | Type | Example | Meaning |\n| -------------------------------- | ---- | --------------------- | --------------------------------------- |\n| AND | R | `and rd, rs1, rs2` | `R[rd] = R[rs1] & R[rs2]` |\n| OR | R | `or rd, rs1, rs2` | `R[rd] = R[rs1] | R[rs2]` |\n| XOR | R | `xor rd, rs1, rs2` | `R[rd] = R[rs1] ^ R[rs2]` |\n| AND immediate | I | `andi rd, rs1, imm12` | `R[rd] = R[rs1] & SignExt(imm12)` |\n| OR immediate | I | `ori rd, rs1, imm12` | `R[rd] = R[rs1] | SignExt(imm12)` |\n| XOR immediate | I | `xori rd, rs1, imm12` | `R[rd] = R[rs1] ^ SignExt(imm12)` |\n| Shift left logical | R | `sll rd, rs1, rs2` | `R[rd] = R[rs1] << R[rs2]` |\n| Shift right arithmetic | R | `sra rd, rs1, rs2` | `R[rd] = R[rs1] >> R[rs2] (arithmetic)` |\n| Shift right logical | R | `srl rd, rs1, rs2` | `R[rd] = R[rs1] >> R[rs2] (logical)` |\n| Shift left logical immediate | I | `slli rd, rs1, shamt` | `R[rd] = R[rs1] << shamt` |\n| Shift right logical immediate | I | `srli rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (logical` |\n| Shift right arithmetic immediate | I | `srai rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (arithmetic)` |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Logical Operations", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx"}, "page_content": "| Shift right logical immediate | I | `srli rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (logical` |\n| Shift right arithmetic immediate | I | `srai rd, rs1, shamt` | `R[rd] = R[rs1] >> shamt (arithmetic)` | \nIf we look at the RISC-V logical operations there isn't anything special apart from there not being a NOT operation. This is because it can be simply implemented by using the XOR operation which sets a bit to 1 if the bits are different and otherwise a 0. To be more precise we XOR with the value that only consists of positive bits to simulate a NOT operation. 
However, we will come across pseudo instructions where there will be a NOT operation.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Operations With Large Constants", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx"}, "page_content": "If we want to work with constants larger than 12 bits we need to use the following instruction: \n```assembly\nlui x19, 0x003D0\n``` \nThis instruction stands for load upper immediate and allows us to load the 20 most significant bits into a register. The 12 remaining bits will be set to 0 but we can also set these by either adding or using an OR operation. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Arithmetic and Logical Operations", "Header 2": "Assembly Optimization", "path": "../pages/digitalGarden/cs/computerArchitecture/riscV/arithmeticLogical.mdx"}, "page_content": "One of the main goals of a compiler is to optimize the program when writing the assembly code. \n```c\n// x in a0, y in a1\nint logical (int x, int y) {\nint t1 = x ^ y;\nint t2 = t1 >> 17;\nint mask = (1 << 8) - 7;\nint rval = t2 & mask;\nreturn rval;\n}\n``` \n```assembly\nxor a0, a0, a1 # a0 = x ^ y (t1)\nsrai a0, a0, 17 # a0 = t1 >> 17 (t2)\nandi a0, a0, 249 # a0 = t2 & ((1 << 8) - 7)\n``` \nIn the above example, we can see that a few simple optimizations have been made: \n- Because x is only needed once we can use its register to store the result of the first line instead of having to use a separate temporary register\n- The calculation of the mask only consists of constants, which means it can be calculated at compile time. This results in the last two statements being combined into one instruction. 
\n```c\n// x in a0, y in a1, z in a2\nint arith (int x, int y, int z) {\nint t1 = x + y;\nint t2 = z + t1;\nint t3 = x + 4;\nint t4 = y * 48;\nint t5 = t3 + t4;\nint rval = t2 - t5;\nreturn rval;\n}\n``` \n```assembly\nadd a5, a0, a1 # a5 = x + y (t1)\nadd a2, a5, a2 # a2 = t1 + z (t2)\naddi a0, a0, 4 # a0 = x + 4 (t3)\nslli a5, a1, 1 # a5 = y * 2\nadd a1, a5, a1 # a1 = a5 + y\nslli a5, a1, 4 # a5 = a1 * 16 (t4)\nadd a0, a0, a5 # a0 = t3 + t4 (t5)\nsub a0, a2, a0 # a0 = t2 - t5 (rval)\n``` \nIn this example the assembly code is actually longer than the C code. However, it has been optimized; length of code does not correspond to efficiency. To be more precise the multiplication has been optimized because multiplications are very slow. So instead of multiplying the compiler tries to make use of bit shifts which are much faster. So `y * 48` becomes `(3y) << 4`. Another example of this would be replacing `7 * x` with `8 * x - x` which can be translated to `(x << 3) - x`.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Template method", "path": "../pages/digitalGarden/cs/patterns/templateMethod.mdx"}, "page_content": "The intent of the template method pattern is to define a skeleton of an algorithm in the superclass but let subclasses override specific steps of the algorithm without changing its structure.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Template method", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/templateMethod.mdx"}, "page_content": "```mermaid\nclassDiagram\nAbstractClass <|-- ConcreteClass1\nAbstractClass <|-- ConcreteClass2\nAbstractClass <|-- ConcreteClass3\nclass AbstractClass{\n+algorithm()\n+step1()\n+step2()\n}\nclass ConcreteClass1{\n+step1()\n+step2()\n}\nclass ConcreteClass2{\n+step1()\n+step2()\n}\nclass ConcreteClass3{\n+step2()\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Composite", "path": "../pages/digitalGarden/cs/patterns/composite.mdx"}, "page_content": 
"The intent of the composite pattern is to represent a recursive tree like structures where individual objects or compositions of objects should be treated uniformly. The most common example is your file structure you have folders and files and folders with files inside other folders etc. You can the perform operations like delete or move on either a single file or folder.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Composite", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/composite.mdx"}, "page_content": "```mermaid\nclassDiagram\nComponentInterface <--o Composite\nComponentInterface <|-- Leaf\nComponentInterface <|-- Composite\nclass ComponentInterface{\n+execute()\n}\nclass Leaf{\n+execute()\n}\nclass Composite{\n-children: Component[]\n+add()\n+remove()\ngetChildren()\nexecute() // delegates all work to children\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Composite", "Header 2": "Things to be aware of", "Header 3": "Placement of managing functions", "path": "../pages/digitalGarden/cs/patterns/composite.mdx"}, "page_content": "There are 2 possibilities to place the managing functions(add, remove etc.) either in the Component or in the composite. If you place them in the Composite (Safe) there is a clear separation of tasks and are only defined where they are usable, however you might have to make type casts. If you place them in the Component (Transparent) you have a unified look however have to provide the functionality for all components which might not necessarly make sense.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Composite", "Header 2": "Things to be aware of", "Header 3": "No cycles", "path": "../pages/digitalGarden/cs/patterns/composite.mdx"}, "page_content": "To make sure that one element is not in 2 composites or that there are cycles (a composite is in a higher composite but the higher composite is also in the lower composite). 
We can add a flag to the abstract class Component to solve this. When adding to a composite we can then make the following checks: \n```java\npublic void addFigure(Figure f) {\nif (f.contained)\nthrow new IllegalArgumentException();\nif (contains(f, this)) {\nthrow new IllegalArgumentException();\n}\nfigures.add(f);\nf.contained = true;\n}\n\nprivate boolean contains(Figure g1, GroupFigure g2) {\nif (g1 == g2) {\nreturn true;\n} else if (g1 instanceof GroupFigure) {\nfor (Figure f : ((GroupFigure) g1).figures) {\nif (contains(f, g2))\nreturn true;\n}\n} return false;\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Strategy", "path": "../pages/digitalGarden/cs/patterns/strategy.mdx"}, "page_content": "The intent of the strategy pattern is to be able to define a family of algorithms that are interchangeable and to which we can easily add more algorithms if needed. So we also want to be able to change behavior just like in the state pattern. For example we want to support multiple different de/encryption methods. If the algorithm only changes based on its parameters we are not speaking of the strategy pattern.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Strategy", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/strategy.mdx"}, "page_content": "We can see that the Class Diagram is very similar to that of the state pattern. What is important here is that the interface is powerful enough to support all current algorithms and also those in the future. 
\n```mermaid\nclassDiagram\nStrategyInterface <--o Context\nStrategyInterface <|-- VariantA\nStrategyInterface <|-- VariantB\nclass Context{\n}\nclass StrategyInterface{\nalgorithm()\n}\nclass VariantA{\nalgorithm()\n}\nclass VariantB{\nalgorithm()\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Strategy", "Header 2": "Example", "path": "../pages/digitalGarden/cs/patterns/strategy.mdx"}, "page_content": "```java\npublic class SecureChannel{\npublic interface Algorithm{\npublic int[] encrypt(byte[] key, int[] plain);\npublic int[] decrypt(byte[] key, int[] encrypted);\n}\nprivate Algorithm algorithm;\npublic void setAlgorithm(Algorithm algorithm) {\nif (algorithm == null) throw new IllegalArgumentException();\nthis.algorithm= algorithm;\n}\npublic void send(byte[] key, int[] plain) {\nwrite(algorithm.encrypt(key, plain));\n}\npublic int[] receive(byte[] key) {\nreturn algorithm.decrypt(key, read());\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Prototype", "path": "../pages/digitalGarden/cs/patterns/prototype.mdx"}, "page_content": "The intent of the prototype pattern is to be able to create new objects by cloning/copying prototype objects.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Prototype", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/prototype.mdx"}, "page_content": "```mermaid\nclassDiagram\nPrototypeInterface <|-- ConcretePrototype\nConcretePrototype <|-- SubClassPrototype\nclass PrototypeInterface{\nclone()\n}\nclass ConcretePrototype{\nclone()\n}\nclass SubClassPrototype{\nclone()\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Prototype", "Header 2": "Solutions", "Header 3": "Cloning based on Object.clone()", "path": "../pages/digitalGarden/cs/patterns/prototype.mdx"}, "page_content": "In java the following clone method is defined. 
\n```java\nclass Object {\nprotected Object clone() throws CloneNotSupportedException\n...\n}\n}\n``` \nThe protected visibility means it can not be invoked on objects of static type and can only be invoked if clone is overridden in a subclass with a suitable visibility. \n1. It is checked whether the class implements interface Cloneable (which is only a marker interface). If this is not the case, a CloneNotSupportedException is thrown.\n2. A new instance is created, i.e. as much memory as used by the original object is allocated however no constructor is invoked.\n3. Instead, the memory of the original object is copied byte by byte into the new instance (a so called memory copy) this means that all attributes are copied over into the new instance if there are fields that are not value types like int etc. but Objects like string etc. then the references are copied resulting in a shallow copy. \nIf you wish for it not to be a shallow copy you have to override the clone method and clone all the object attributes correctly. When overriding the clone method you can strengthen the result type i.e not returning Object but the correct type. Final fields will just be copied they can not be changed if this is needed then you need to use copy constructors.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Prototype", "Header 2": "Solutions", "Header 3": "Cloning based on copy constructors", "path": "../pages/digitalGarden/cs/patterns/prototype.mdx"}, "page_content": "Copy constructors are constructors that receive an instance of their own type as parameter. 
They then initialize the new instance with the same values as the prototype passed in.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Prototype", "Header 2": "Solutions", "Header 3": "Cloning based on serialization", "path": "../pages/digitalGarden/cs/patterns/prototype.mdx"}, "page_content": "If all the classes to be cloned are serializable (implement the Serializable interface), cloning can also be implemented with the help of Java serialization. A clone method that follows this approach looks as follows: \n```java\n@Override\npublic Object clone() {\ntry {\nByteArrayOutputStream baos = new ByteArrayOutputStream();\nObjectOutputStream oos = new ObjectOutputStream(baos);\noos.writeObject(this); // serialize the whole object graph\noos.close();\nbyte[] buf = baos.toByteArray();\nByteArrayInputStream bais = new ByteArrayInputStream(buf);\nObjectInputStream ois = new ObjectInputStream(bais);\nreturn ois.readObject(); // deserializing yields a deep copy\n} catch (IOException | ClassNotFoundException e) {\n// clone() may not throw these checked exceptions, so wrap them\nthrow new IllegalStateException(e);\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "SOLID", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "The SOLID principles are a set of five design principles that help software developers create more maintainable,\nflexible, and robust code. They were first introduced by Robert C. Martin \"Uncle Bob\" in his book \"Agile Software\nDevelopment, Principles, Patterns, and Practices\".", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "SOLID", "Header 3": "Single Responsibility Principle", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "A class/method should only have a single purpose/responsibility and therefore only one reason to change. 
\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "SOLID", "Header 3": "Open/Closed Principle", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "Classes should be open for extension but closed for modification. To extend the behavior, new code should be added however old code should not have to be modified. This then prevents situations in which a change to classes also requires adaption of all depending classes. This is achieved with interfaces which allow different implementations by keep the same API. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "SOLID", "Header 3": "Liskov Substitution Principle", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "Subtypes should be substitutable for their base types. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "SOLID", "Header 3": "Interface Segregation Principle", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "Make fine-grained interfaces that are client specific instead of general purpose interfaces (Which slightly contradicts the strategy pattern). Clients should not be forced to depend on methods that they do not use. \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "SOLID", "Header 3": "Dependency Inversion Principle", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "Depend on Abstractions not on concrete classes etc. 
\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "Other Good Coding Principles", "Header 3": "Favor Composition over Inheritance", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "By using composition for example in the strategy pattern instead of inheritance it allows us to be flexible at runtime.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "Other Good Coding Principles", "Header 3": "Program to an interface, not and implementation", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "Avoid referencing concrete classes, declare interfaces instead as then implementations are easily switched out.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "Other Good Coding Principles", "Header 3": "Encapsulate what varies", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "By encapsulating/hiding the parts that can vary for example implementations behind interfaces we can minimize the impact\nof that code because thanks to the interface we have a unified API.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "OOP Design Principles", "Header 2": "Other Good Coding Principles", "Header 3": "KISS", "path": "../pages/digitalGarden/cs/patterns/oopDesignPrinciples.mdx"}, "page_content": "Keep it simple stupid. The simpler the code the easier it is to maintain and understand.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": "Factory patterns are so called creational patterns meaning their intent is to abstract/hide how objects are created. 
This in turn allows the client to be independent of how its objects are created.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "Header 2": "Factory method", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": "The factory method pattern delegates the instantiation of objects to a method, either an instance method that subclasses override or a static method. The disadvantage of a static factory method is that it cannot be overridden in a subclass to change the behavior; the advantage is that you don't need to create an object to make use of the method.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "Header 2": "Factory method", "Header 3": "Structure", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": "```mermaid\nclassDiagram\nCreator <|-- ConcreteCreator\nCreator --> ProductInterface\nProductInterface <|-- ConcreteProduct\nclass Creator{\n+someOperation()\n+createProduct(): Product\n}\nclass ProductInterface{\n}\nclass ConcreteProduct{\n}\nclass ConcreteCreator{\n+createProduct(): Product\nreturns a ConcreteProduct\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "Header 2": "Factory method", "Header 3": "Example", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": "", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "Header 2": "Abstract factory", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": "The abstract factory pattern delegates the instantiation of families of related objects to another object.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "Header 2": "Abstract factory", "Header 3": "Structure", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": " \nA big question here is where to keep the concrete factory so that all clients have access to it. 
Often this is done in its own class: \n```java\npublic class CurrentFactory {\nprivate CurrentFactory() { } // prevents instantiation\nprivate static Factory fac = null;\npublic static Factory getFactory() { return fac; }\npublic static void setFactory(Factory f) {\nif (f == null) throw new NullPointerException();\nfac = f;\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Factory", "Header 2": "Abstract factory", "Header 3": "Example", "path": "../pages/digitalGarden/cs/patterns/factory.mdx"}, "page_content": "", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Iterators", "path": "../pages/digitalGarden/cs/patterns/iterator.mdx"}, "page_content": "You often find yourself traversing/iterating over a [collection](../algorithmsDataStructures/collections). The iterator pattern encapsulates this functionality without exposing the underlying representation of the collection. In Java this pattern is used to implement the enhanced for loop, also called the for-each loop. Under the hood, all the for-each loop does is get an iterator instance and iterate over it step by step. This is also why you can only use the for-each loop on collections that actually implement the [`Iterable`](https://docs.oracle.com/javase/8/docs/api/java/lang/Iterable.html) interface. \nWhy is this pattern useful? \n- Because depending on the collection there might be different ways to traverse it, for example [binary trees can be traversed in many orders](../algorithmsDataStructures/trees/binaryTrees#traversal-orders), but the methods you call to traverse the collection should be the same. 
We want a common interface for traversing different collections in different ways.\n- Also, because the iterator object holds all the details regarding the traversal of the collection, several iterators can go through the same collection at the same time, independently of each other as long as they don't change the underlying collection, if they do it gets a bit complicated with [mutexes](../concurrentProgramming/locking#locks). \n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Iterators", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/iterator.mdx"}, "page_content": "The `Iterable` interface defines the method `Iterator iterator()`, or `Iterator createIterator()`which is responsible for creating and returning the [iterator interface](https://docs.oracle.com/javase/8/docs/api/java/util/Iterator.html) that can then be used to traverse the collection containing elements of type `T`. \nAn iterator always holds the value of the next element, apart from at the beginning of an iteration, where it holds a reference to the first element. In java the iterator instance returned must implement the `Iterator` interface which declares the operations required for traversing the collection. 
\n\nAn animation of the iterator moving through a linked list would be cool.\n \n\nWhen implementing an iterator it is recommended to do so in an internal private final class in the collection class as you then have access to the internal structure of the collection without having to make all the details public.\n \nThe pattern then has an overall structure that can look something like this: \n```plantuml\n@startuml\n!theme purplerain from https://raw.githubusercontent.com/LuciferUchiha/nextra-garden/main\ninterface \"Iterable\" as Iterable {\nIterator iterator()\n}\n\ninterface \"Iterator\" as Iterator {\nT next()\nboolean hasNext()\n}\nclass \"ConcreteIterator\" as ConcreteIterator {\nT next()\nboolean hasNext()\n}\nclass \"Collection\" as Collection {\nIterator iterator()\n}\n\nCollection --|> Iterable\nConcreteIterator --|> Iterator\nCollection --> ConcreteIterator\nIterable --> Iterator\n@enduml\n``` \n- The `T next()` method returns the element the iterator is currently pointing to and advances the iterator to the next element.\n- The `boolean hasNext()` method returns false once the iterator has reached the end of the collection, otherwise true.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Iterators", "Header 2": "ListIterator", "path": "../pages/digitalGarden/cs/patterns/iterator.mdx"}, "page_content": "In Java, there is also the [`ListIterator`](https://docs.oracle.com/javase/8/docs/api/java/util/ListIterator.html) interface which extends the `Iterator` interface. This interface adds functionality that allows for iteration in both directions with `T next()` and `T previous()`. Matchingly it also offers a `boolean hasPrevious()` to the `boolean hasNext()` method. One can imagine that ListIterator has no current element, its position is always between the element that would be returned by a call to `previous()` and the element that would be returned by a call to `next()`. 
\nBecause a `ListIterator` traverses a list, each position also has an index, which can be fetched with `nextIndex()` and `previousIndex()`.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Iterators", "Header 2": "Removing Elements", "path": "../pages/digitalGarden/cs/patterns/iterator.mdx"}, "page_content": "The iterator interface also defines an optional `void remove()` method that removes the most recently returned element from the underlying collection. \n \nHowever, when removing elements whilst other iterators are also traversing the collection, for example in a concurrent program, some problems can occur, mainly inconsistent state and iterators being cut off from iterating further. One way of solving these problems is by using a modification counter (modCount) which is incremented whenever the underlying collection is changed, for example when adding or removing an element. When an iterator is instantiated, it copies the modCount and on every operation checks that its copy still matches the modCount of the collection; if not, a `ConcurrentModificationException` is thrown. 
How this works for different collections in java is [explained here](https://stackoverflow.com/a/5847949/10994912).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Iterators", "Header 2": "Implementation", "path": "../pages/digitalGarden/cs/patterns/iterator.mdx"}, "page_content": "A possible implementation for an iterator over a [linked list](../algorithmsDataStructures/linkedLists) could look like this: \n```java\n\nclass SomeCollection implements Iterable {\n\n//implementation of other functions and internal state\n\n@Override\npublic Iterator iterator() {\nreturn new MyIterator();\n}\n\nprivate final class MyIterator implements Iterator {\n// p and pp keeps track of the previous elements to be able to remove\nprivate Node next = first, p = null, pp = null;\nprivate int myModCount = modCount; // copy the modCount of the collection\nprivate boolean mayRemove = false;\n\n@Override\npublic boolean hasNext() {\nreturn next != null;\n}\n\n@Override\npublic T next() {\nif (modCount != myModCount)\nthrow new ConcurrentModificationException();\nif (next == null)\nthrow new NoSuchElementException();\nT elem = next.elem;\nif (p != null) pp = p;\np = next;\nnext = next.next;\nmayRemove = true;\nreturn elem;\n}\n\n@Override\npublic void remove() {\nif (modCount != myModCount)\nthrow new ConcurrentModificationException();\nif (!mayRemove)\nthrow new IllegalStateException();\nif (pp != null) pp.next = next;\nelse first = next;\nif (next == null) last = pp;\np = pp;\nmayRemove = false;\nsize--;\nmodCount++;\nmyModCount = modCount;\n}\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Singleton", "path": "../pages/digitalGarden/cs/patterns/singleton.mdx"}, "page_content": "The intent of the singleton pattern is that a class only has a single instance which is accessed over a global point. For example config classes should only exist once. 
You can do this by making the class final to stop inheritance, making the constructor private so no instance can be created from outside, and adding a static method that creates the one instance if it does not exist yet and returns it. It should either not support cloning or return the same instance (this).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Singleton", "Header 2": "Things to be aware of", "Header 3": "Eager and lazy initialization", "path": "../pages/digitalGarden/cs/patterns/singleton.mdx"}, "page_content": "Eager means the instance is created as soon as the class is first initialized, for example triggered by a call to some other static method of the class, which can use a lot of memory although the instance is not needed. \n```java\npublic final class Singleton {\nprivate Singleton(){}\nprivate static Singleton instance = new Singleton();\npublic static Singleton getInstance(){\nreturn instance;\n}\n}\n``` \nLazy means the instance is created when `getInstance` is first called. This can however cause issues with multithreading, which is why you need to synchronize the method. \n```java\npublic final class Singleton {\nprivate Singleton(){}\nprivate static Singleton instance = null;\npublic static synchronized Singleton getInstance(){\nif(instance == null) instance = new Singleton();\nreturn instance;\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Singleton", "Header 2": "Things to be aware of", "Header 3": "Garbage collection", "path": "../pages/digitalGarden/cs/patterns/singleton.mdx"}, "page_content": "The instance cannot be reclaimed by the garbage collector as it is held in a static field. 
If this is a concern, you can hold it via a `WeakReference` (cleared once it is no longer referenced by strong references) or a `SoftReference` (cleared when the system is short of memory).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Singleton", "Header 2": "Things to be aware of", "Header 3": "Serialization", "path": "../pages/digitalGarden/cs/patterns/singleton.mdx"}, "page_content": "Deserialization of a serialized singleton instance may lead to several singleton instances. To avoid this we can do the following: \n```java\npublic final class Singleton implements Serializable {\nprivate Singleton(){ }\nprivate static Singleton instance = null;\npublic static synchronized Singleton getInstance(){\nif(instance == null) instance = new Singleton();\nreturn instance;\n}\npublic Object readResolve(){\nreturn getInstance();\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Singleton", "Header 2": "Things to be aware of", "Header 3": "Singleton with Demand Holder Idiom", "path": "../pages/digitalGarden/cs/patterns/singleton.mdx"}, "page_content": "You can also create a Singleton the following way. This solution is thread-safe and lazy. It is important that the construction of the Singleton does not fail with an exception. \n```java\npublic class Singleton {\nprivate Singleton() {}\n\nprivate static class LazyHolder {\nstatic final Singleton INSTANCE = new Singleton();\n}\n\npublic static Singleton getInstance() {\nreturn LazyHolder.INSTANCE;\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Singleton", "Header 2": "Things to be aware of", "Header 3": "Singleton with Enum", "path": "../pages/digitalGarden/cs/patterns/singleton.mdx"}, "page_content": "You can also create a Singleton with an Enum. It is easy, thread-safe and provides serialization for free. However, it cannot be extended to multiple instances and its fields are not serialized. 
\n```java\npublic enum SingletonDriver implements Driver {\nINSTANCE;\npublic String toString () { return \"Singleton \";\npublic void playSong(File file ) { ... }\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Command", "path": "../pages/digitalGarden/cs/patterns/command.mdx"}, "page_content": "The intent of the command pattern is to turn commands into stand-alone objects so that the object invoking the command does not need to worry about how the command is done. By doing so you can delay, queue, undo commands.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Command", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/command.mdx"}, "page_content": "```plantuml\n@startuml\n!theme purplerain from https://raw.githubusercontent.com/LuciferUchiha/georgerowlands.ch/main\n\nCommand <-- Invoker : calls\nCommand <|-- CommandImpl\nReceiver <-- CommandImpl : calls\n\nclass Invoker {\nvoid setCommand(Command c)\nvoid execute()\n}\n\ninterface Command{\nvoid execute()\nvoid undo()\n}\n\nclass Receiver{\naction()\n}\n\n@enduml\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Command", "Header 2": "Example", "path": "../pages/digitalGarden/cs/patterns/command.mdx"}, "page_content": "```java\n// Client\npublic class CommandPatternDemo {\npublic static void main(String[] args) {\nStock abcStock = new Stock();\n\nBuyStock buyStockOrder = new BuyStock(abcStock);\nSellStock sellStockOrder = new SellStock(abcStock);\n\nBroker broker = new Broker();\nbroker.takeOrder(buyStockOrder);\nbroker.takeOrder(sellStockOrder);\n\nbroker.placeOrders();\n}\n}\n// Command Interface\npublic interface Order {\nvoid execute();\n}\n// Receiver\npublic class Stock {\nprivate String name = \"ABC\";\nprivate int quantity = 10;\n\npublic void buy(){\nSystem.out.println(\"Stock [ Name: \"+name+\",\nQuantity: \" + quantity +\" ] bought\");\n}\npublic void sell(){\nSystem.out.println(\"Stock [ Name: \"+name+\",\nQuantity: \" + quantity 
+\" ] sold\");\n}\n}\n// ConcreteCommands\npublic class BuyStock implements Order {\nprivate Stock abcStock;\n\npublic BuyStock(Stock abcStock){\nthis.abcStock = abcStock;\n}\n\npublic void execute() {\nabcStock.buy();\n}\n}\npublic class SellStock implements Order {\nprivate Stock abcStock;\n\npublic SellStock(Stock abcStock){\nthis.abcStock = abcStock;\n}\n\npublic void execute() {\nabcStock.sell();\n}\n}\n// Invoker\npublic class Broker {\nprivate List orderList = new ArrayList();\n\npublic void takeOrder(Order order){\norderList.add(order);\n}\n\npublic void placeOrders(){\nfor (Order order : orderList) {\norder.execute();\n}\norderList.clear();\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Decorator", "path": "../pages/digitalGarden/cs/patterns/decorator.mdx"}, "page_content": "The intent of the decorator pattern is to give objects new responsibilities without overusing inheritance and therefore creating a bunch of classes. Instead Components get decorated to further enhance objects.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Decorator", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/decorator.mdx"}, "page_content": "```mermaid\nclassDiagram\nComponent <|-- IDecorator\nComponent <|-- ConcreteComponent\nIDecorator <|-- ConcreteDecoratorA\nIDecorator <|-- ConcreteDecoratorB\nComponent <-- IDecorator\nclass Component {\n+operationA()\n+operationB()\n}\nclass IDecorator {\nComponent wrappedObj\n+operationA()\n+operationB()\n}\nclass ConcreteDecoratorA{\nObject newState // can extend state\n+operationA()\n+operationB()\n}\n\nclass ConcreteDecoratorB{\n+operationA()\n+operationB()\n+newBehavior() // can add new\n}\n\nclass ConcreteComponent {\n+operationA()\n+operationB()\n}\n``` \nWith this structure you can do things in the decorator before after calling wrappedObj.operationA().", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Decorator", "Header 2": "Example", "path": 
"../pages/digitalGarden/cs/patterns/decorator.mdx"}, "page_content": "```java\npublic abstract class Beverage {\nString description = \"Unknown Beverage\";\n\npublic String getDescription() {\nreturn description;\n}\n\npublic abstract double cost();\n}\npublic abstract class CondimentDecorator extends Beverage {\npublic abstract String getDescription();\n}\npublic class Espresso extends Beverage {\n\npublic Espresso() {\ndescription = \"Espresso\";\n}\n\npublic double cost() {\nreturn 1.99;\n}\n}\npublic class Milk extends CondimentDecorator {\nBeverage beverage;\n\npublic Milk(Beverage beverage) {\nthis.beverage = beverage;\n}\n\npublic String getDescription() {\nreturn beverage.getDescription() + \", Milk\";\n}\n\npublic double cost() {\nreturn .10 + beverage.cost();\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Observer", "path": "../pages/digitalGarden/cs/patterns/observer.mdx"}, "page_content": "The intent of the Observer Pattern is that there a dependant objects of another object and that these dependent objects can be notified of changes without the other object knowing of the dependent objects and the connection between the the cooperating objects being to tight. The Observer Pattern is also commonly know as listener or Publish-Subscribe pattern. 
\nYou can find a good detailed description [here](https://refactoring.guru/design-patterns/observer)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Observer", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/observer.mdx"}, "page_content": "```mermaid\nclassDiagram\nSubjectInterface <|-- ConcreteSubject\nSubjectInterface --> ObserverInterface\nObserverInterface <|-- ConcreteObserver\nConcreteObserver --> ConcreteSubject\nclass SubjectInterface{\n+registerObserver()\n+removeObserver()\n+notifyObservers()\n}\nclass ConcreteSubject{\n-state\n+registerObserver()\n+removeObserver()\n+notifyObserver()\n+getState()\n+setState()\n}\nclass ObserverInterface{\n+update()\n}\nclass ConcreteObserver{\n-observedObject\n+update()\n}\n``` \nOften instead of a interface for subject abstract classes are used as the behaviour for the 3 functions is always the same and you can add the data structure for storing the observers. \nIn java you can use the provided `java.util.Observable` and `java.util.Observer` but is often not recommended.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Observer", "Header 2": "Things to be aware of", "Header 3": "How to notify", "path": "../pages/digitalGarden/cs/patterns/observer.mdx"}, "page_content": "There are a few ways to notify the observers. \nFirstly there is the question of when to notify observers. Either they are notified when the `setState` function is called which is in most cases the way to go and the easiest to implement however it can lead to lots of updates. Or you can explicitly call the update function when you think it is needed, you just need to make sure you don't forget. \nThen there is the question of what should a notification look like. We can differentiate between 2 models, push and pull. With pull the object is just notified that there has been a change and then has to get (pull) the actually changes itself. \n- `update()` without any parameters. 
\nOr there is the push option where you tell the observer what has changed and where. This can be done in multiple ways. \n- `update(Subject s, Color c)` with sender and/or the exact data that changed. This is easy but gets awkward when multiple things changed.\n- `update(Subject s, Object args)` with sender and/or Object containing the changes or the new state, as often seen in C#.\n- `update(Event e)` an event object that contains everything as seen in JavaFX or Swing.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Observer", "Header 2": "Things to be aware of", "Header 3": "Other small things", "path": "../pages/digitalGarden/cs/patterns/observer.mdx"}, "page_content": "There is clearly loose coupling between the subject and the observer as the subject does not know any concrete observers, only the interface. \nA notification is broadcast, meaning it is sent to all registered observers. It is then up to the observer how it handles notifications. \nA simple change of the subject's state can cause a cascade of updates and therefore multiple notifications. \nYou must be careful that you do not create infinite loops by changing the state in the update function, as this will cause the observer to be notified again. \nIt is good practice to make sure the state really has changed before notifying all the observers and to also not allow the same listener to be added multiple times. \nIn Java you can use a `CopyOnWriteArrayList` as the data structure, as it can happen that observers detach in the update method, which would change the list that you iterate over in `notifyObservers()`", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Observer", "Header 2": "Example", "path": "../pages/digitalGarden/cs/patterns/observer.mdx"}, "page_content": "This example is without the Subject interface, however it could be added, especially if there will be multiple subjects. 
\n```java\npublic class NewsAgency{\nprivate String news;\nprivate List channels = new ArrayList<>();\n\npublic void addObserver(Observer channel) {\nthis.channels.add(channel);\n}\n\npublic void removeObserver(Observer channel) {\nthis.channels.remove(channel);\n}\n\npublic void setNews(String news) {\nthis.news = news;\nfor (Observer channel : this.channels) {\nchannel.update(this.news);\n}\n}\n}\n\npublic class NewsChannel implements Observer {\nprivate String news;\n\n@Override\npublic void update(Object news) {\nthis.setNews((String) news);\n}\n}\n\npublic interface Observer {\npublic void update(Object o);\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "State", "path": "../pages/digitalGarden/cs/patterns/state.mdx"}, "page_content": "The intent of the state pattern is that we can alter a programs behaviour (just like the strategy pattern) when its internal state changes and outsource the state-dependent behavior. A common example is some sort of dispenser machine that has multiple different states and depending on actions changes its state. For example a ticketmachine can be in the state \"INIT\" and when the action \"enterMoney\" is executed the machine changes to the state \"MONEY_ENTERED\".", "type": "Document"} -{"id": null, "metadata": {"Header 1": "State", "Header 2": "Structure", "path": "../pages/digitalGarden/cs/patterns/state.mdx"}, "page_content": "```mermaid\nclassDiagram\nStateInterface <-- Context\nStateInterface <|-- ConcreteStateA\nStateInterface <|-- ConcreteStateB\nclass Context{\nrequest() = currentState.handle()\n}\nclass StateInterface{\nhandle()\n}\nclass ConcreteStateA{\nhandle()\n}\nclass ConcreteStateB{\nhandle()\n}\n``` \nThe hardest part of implementing the state pattern is defining the state interface. The easiest way to do so is to draw a state diagram of the system. 
All the actions are then the methods in the interface and all the states are the concrete states.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "State", "Header 2": "Things to be aware of", "Header 3": "State Transition", "path": "../pages/digitalGarden/cs/patterns/state.mdx"}, "page_content": "There are a few ways in which state transitions can be done. \nDecentralized = may be initiated by state objects. For that the state must know its successors and either needs access to a state transition method in the context `context.setState(s);` or returns the new state, which is then set by the context. \nParameterized = may be signaled by the state and executed by the context, i.e. by returning a key, e.g. a string or int. The association between keys and states is held in the context. \nCentralized = initiated by the context, state should be informed if it is activated or deactivated.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "State", "Header 2": "Creation of state objects", "path": "../pages/digitalGarden/cs/patterns/state.mdx"}, "page_content": "Created when needed = `c.setState(new StateB());` when state changes are rare.\nCreation ahead of time = `c.setState(c.STATE_B);` states have to be stored in context.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "State", "Header 2": "Example", "path": "../pages/digitalGarden/cs/patterns/state.mdx"}, "page_content": "```java\npublic class TicketMachine{\nprivate int destination;\nprivate boolean firstClass, dayTicket, halfPrice;\nprivate double price, enteredMoney;\nprivate interface State {\nvoid setDestination(int destination);\nvoid setFirstClass(boolean firstClass);\nvoid setDayTicket(boolean dayTicket);\nvoid setHalfPrice(boolean halfPrice);\nvoid enterMoney(double amount);\nvoid cancel();\n}\nprivate final State INIT = new StateInit();\nprivate final State DEST_SELECTED = new StateDestSelected();\nprivate final State MONEY_ENTERED = new StateMoneyEntered();\nprivate State state = INIT;\n\npublic void 
enterMoney(double amount) {\nstate.enterMoney(amount);\n}\n// etc...\nabstract class AbstractState implements State {\npublic void setDestination(int destination) {\nthrow new IllegalStateException(); }\npublic void setFirstClass(boolean firstClass) {\nthrow new IllegalStateException(); }\npublic void setDayTicket(boolean dayTicket) {\nthrow new IllegalStateException(); }\npublic void setHalfPrice(boolean halfPrice) {\nthrow new IllegalStateException(); }\npublic void enterMoney(double amount) {\nthrow new IllegalStateException(); }\npublic void cancel() { state = INIT; }\n}\nclass StateDestSelected extends AbstractState{\npublic void setFirstClass(boolean fc) {\nfirstClass = fc;\nprice = calculatePrice(destination, firstClass);\n}\npublic void enterMoney(double amount) {\nstate = MONEY_ENTERED; state.enterMoney(amount);\n}\n}\nclass StateMoneyEntered extends AbstractState{\npublic void enterMoney(double amount) {\nenteredMoney += amount;\nif (enteredMoney >= price) {\nprintTicketWithChange(destination, price, firstClass);\nstate = INIT;\n}\n}\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Fluent in Python", "path": "../pages/digitalGarden/cs/python/fluent.mdx"}, "page_content": "Some key notes and takeaways from the book Fluent Python by Luciano Ramalho.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Fluent in Python", "Header 2": "Chapter 1: The Python Data Model", "path": "../pages/digitalGarden/cs/python/fluent.mdx"}, "page_content": "- Make use of it, it's there for a reason. It makes implementing collections and classes easier.\n- Implement the special/magic/dunder methods to make your classes behave like built-in types.\n- Don't call special methods directly, use built-in functions instead like `len()`, `iter()`, `str()`, etc.
rather than\n`obj.__len__()`, `obj.__iter__()`, `obj.__str__()`, etc.\n- collections.namedtuple is a great way to create a class that is just a collection of attributes.\n- ABC = Abstract Base Class\n- `__repr__` is for developers, `__str__` is for end users. If you only implement one, implement `__repr__`.\n- By default custom classes are truthy, unless you implement `__bool__` or `__len__` and return `False` or `0` respectively.\n- For numpy and some built-in types like list or str that are implemented in C, `__len__` is a C function that returns\nthe value of the `ob_size` field in the `PyObject` struct that represents any variable in the CPython implementation.\nThis is done for performance reasons. Is it important to know this? Probably not, but it's interesting. \n```python\nimport collections\n\nCard = collections.namedtuple('Card', ['rank', 'suit'])\n\nclass FrenchDeck:\nRANKS = [str(n) for n in range(2, 11)] + list('JQKA')\nSUIT_VALUES = {'Spades': 3, 'Hearts': 2, 'Diamonds': 1, 'Clubs': 0}\n\ndef __init__(self):\n# cartesian product of ranks and suits using listcomps\nself._cards = [Card(rank, suit) for suit in self.SUIT_VALUES\nfor rank in self.RANKS]\n\ndef __len__(self):\nreturn len(self._cards)\n\ndef __getitem__(self, position):\nreturn self._cards[position]\n\ndef card_value(self, card):\nrank_value = self.RANKS.index(card.rank)\nreturn rank_value * len(self.SUIT_VALUES) + self.SUIT_VALUES[card.suit]\n\ndeck = FrenchDeck()\nsorted_deck = sorted(deck, key=deck.card_value)\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Fluent in Python", "Header 2": "Chapter 2: An Array of Sequences", "path": "../pages/digitalGarden/cs/python/fluent.mdx"}, "page_content": "- Listcomps are faster than map and filter, and more readable than the equivalent for loop.\n- Generator expressions are memory efficient and can be used as function arguments. 
`tuple(ord(symbol) for\nsymbol in symbols)`\n- Tuples are immutable, but the objects they contain may be mutable!\n- Tuples don't need to be seen as immutable lists, they can be used as records with no field names.\n- Unpacking can be used to swap variables `a, b = b, a` or to discard values `_, b = (1, 2)` or to split a list into\nhead and tail `head, *tail = [1, 2, 3, 4]` or return multiple values from a function `return a, b`.\n- The `*` operator can be used to grab excess items, kind of a wild card.\n- Pattern matching is cool and generally simple. But some patterns can be complex and hard to read.\n- `collections.deque` is a great data structure for implementing a queue. It's a doubly linked list with O(1) time\ncomplexity for adding or removing items from either end. Where as a list needs to shift all the items.\n- `collections.deque` can be used to implement a bounded queue i.e. fixed size queue by passing a `maxlen` argument to\nthe constructor.\n- slices are objects and can be used as arguments to functions and stored in variables.\n- lists can be manipulated in place using slice assignment `l[2:5] = [20, 30]` or `del l[5:7]`.\n- Memory views seem complicated and I don't really understand them yet.\n- `array.array` is a great way to store a large number of numerical values. 
It's more efficient than a list of ints\nbut can only store one type of value.\n- there is also the bisect module which provides binary search and insertion into sorted sequences which can be handy.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "The actor model is not just good for concurrent programming but also for distributed systems as the actors can be on different systems and still communicate with each other.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "Header 2": "Java Akka", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "We will be using the Akka framework again but instead of using the [Scala version of Akka](../Concurrent%20Programming/13-actorModel.md) we will be using the Java version which is very similar.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "Header 2": "Java Akka", "Header 3": "Creating an Actor", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "In Java, actors extend the `AbstractActor` class and must implement the `Receive createReceive()` method. The following actor discards all received messages because no matching is done.
\n```java\npublic class PrintActor extends AbstractActor {\n@Override\npublic Receive createReceive() {\nreturn receiveBuilder().build();\n}\n}\n``` \nTo react to messages we can use pattern matching like the example below: \n```java\npublic class PrintActor extends AbstractActor {\nprivate int cnt = 0;\n@Override\npublic Receive createReceive() {\nreturn receiveBuilder().matchAny(t -> onReceive(t)).build();\n}\nprivate void onReceive(Object msg) {\ncnt++;\nif (msg instanceof String) {\nSystem.err.println(cnt + \": received message \" + msg);\n} else {\nSystem.err.println(cnt + \": received unknown message\");\n}\n}\n}\n``` \nThe actual actor objects are then created and started asynchronously by using the actor system. Just like in Scala, trying to create an actor object using `new` throws an `ActorInitializationException`. \n```java\npublic static void main(String[] args) throws Exception {\nActorSystem as = ActorSystem.create();\nActorRef actor = as.actorOf(\nProps.create(PrintActor.class),\n\"Printer\" // name is optional and must be unique\n); // returns an immutable reference\n// ActorRef print = as.actorOf(Props.create(PrintActor.class, \"Msg:\")); For non default constructor actors\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "Header 2": "Java Akka", "Header 3": "Sending Messages", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "In the Scala version of Akka messages could be sent using the tell operator `!`. However, this is not possible in Java for syntax reasons so instead, messages are sent to an actor by calling the `tell` method on the receiving actor. Just like in Scala the message is guaranteed to be delivered at most once.
\n```java\nreceivingActor.tell(msg, ActorRef.noSender()) // ActorRef.noSender() is same as null\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "Header 2": "Java Akka", "Header 3": "Receiving Messages", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "In Java there are the following methods that can be used for pattern matching: \n- `matchAny(UnitApply<Object> apply)` Matches any argument.\n- `match(Class<P> type, UnitApply<P> apply)` Matches an argument of a particular type.\n- `match(Class<P> type, TypedPredicate<P> p, UnitApply<P>
app)` Matches an argument of a particular type that matches a given predicate. \nFor example: \n```java\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(String.class,\ns -> s.startsWith(\"MSG:\"),\nmsg -> System.err.println(cnt++ + \": received message \" + msg))\n.build();\n}\n``` \nThe above functions also allow you to then do pattern matching very similarly to pattern matching in Scala using case classes (in Java records). \n```java\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(LoginMessage.class, msg -> {\nString username = msg.username();\nsessions.put(username, getSender());\nbroadcastMessage(username, \"I just logged in\");\n})\n.match(TextMessage.class, msg -> {\nbroadcastMessage(msg.username(), msg.message());\n})\n.match(LogoutMessage.class, msg -> {\nString username = msg.username();\nsessions.remove(username);\nbroadcastMessage(username, \"I just logged out\");\ngetSender().tell(msg, getSelf());\n})\n.matchAny(msg -> unhandled(msg))\n.build();\n}\n``` \nInside the Actor there are also the following methods that can be used just like in Scala: \n- `getSelf()` Returns the actor reference to itself.\n- `getSender()` Returns the actor reference to the sender of the currently processed message.\n- `getContext()` Returns this actor's context, may be used to create child actors.\n- `forward(Object message, ActorContext context)` Forwards the message and passes the original sender actor as the sender. Same as `a.tell(msg, getSender())`.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "Header 2": "Java Akka", "Header 3": "Distributed Actors", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "To use Akka in a distributed system actors need to be configured. Configurations can include log levels, message serializers, protocol specifics etc.
\nYou can either define one big config file for all actors with the following structure: \n```yaml title=\"application.conf\"\nPrintConfig {\nakka {\nactor {\nprovider = remote\n}\nremote {\nartery {\ntransport = tcp\ncanonical.hostname = \"127.0.0.1\"\ncanonical.port = 25520\n}\n}\n}\n}\n\nOtherConfig {\nakka {\nactor {\n...\n``` \nAnd then use them like the following: \n```java\npublic static void main(String[] args) {\nConfig config = ConfigFactory.load().getConfig(\"PrintConfig\"); // without getConfig just gets base.\nSystem.out.println(config.getInt(\"akka.remote.artery.canonical.port\")); // 25520\nActorSystem sys = ActorSystem.create(\"PrintApplication\", config);\nsys.actorOf(Props.create(PrintActor.class), \"PrintServer\");\nSystem.out.println(\"Started Print Application\");\n}\n``` \nOr save each config in a separate file like `foo.conf` and then read the config with `ConfigFactory.load(\"foo\")`. \nIf you for some reason don't have an ActorRef to a distributed actor or want to make use of wildcards you can do this with the `actorSelection()` function: \n```java\nActorSelection actor = as.actorSelection(\"akka://PrintApplication@127.0.0.1:25520/user/PrintServer\"); // PrintApplication is the name of the system. /user is the guardian that is the parent of all user-created actors.\nActorSelection actors = as.actorSelection(\"akka://PrintApplication@127.0.0.1:25520/user/*/PrintServer\"); // all PrintServer actors no matter the parent actor\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Actor Model", "Header 2": "Ask Pattern", "path": "../pages/digitalGarden/cs/distributedSystems/actorModel.mdx"}, "page_content": "In the Java version of Akka you can also use the ask pattern. This can be used when you expect an answer to a sent message. Instead of using the ask `?` operator, there is the static `ask(ActorRef ref, Object msg, long timeout)` method defined in the `akka.pattern.Patterns` class, which returns an **Akka Future**, not to be mixed up with the normal Java Future.
The Future is either a Success Object containing the response message or a Failure containing an AskTimeoutException. \n```java\nTimeout timeout = new Timeout(5, TimeUnit.SECONDS);\nFuture<Object> res = Patterns.ask(actor, message, timeout);\nreturn (String) Await.result(res, timeout.duration()); // blocking\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Websockets", "Header 2": "Bidirectional Communication", "path": "../pages/digitalGarden/cs/distributedSystems/websockets.mdx"}, "page_content": "What if we wanted a system that could send notifications to the client asynchronously? This is exactly what WebSockets do, they set up a bidirectional channel using HTTP/TCP and enable server-driven, full-duplex messaging. \n![webSocketsCommunication](/compSci/webSocketsCommunication.png)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Websockets", "Header 2": "Specification", "path": "../pages/digitalGarden/cs/distributedSystems/websockets.mdx"}, "page_content": "The WebSocket protocol specification defines `ws` (WebSocket) and `wss` (WebSocket Secure) as two new uniform resource identifier (URI) schemes that are used for unencrypted and encrypted connections respectively. To start a WebSocket channel there needs to be an initial handshake which is done over HTTP using the `Upgrade` header.
\n```cmd title=\"Request\"\nGET /examples/websocket/echoStream HTTP/1.1\nHost: server.example.com\nConnection: Upgrade\nUpgrade: websocket\nSec-Websocket-Key: mqn5Pm7wtXEX6BzqDInLjw==\nSec-Websocket-Version: 13\n``` \n```cmd title=\"Response\"\nHTTP/1.1 101 Switching Protocols\nServer: Apache-Coyote/1.1\nUpgrade: websocket\nConnection: upgrade\nSec-WebSocket-Accept: +TdGPOkAq62+toDOhVGj2QZWwg8=\nDate: Thu, 04 Apr 2021 19:21:39 GMT\n``` \nThe return key verifies, that the server understood the request and is calculated like the following: \n```java\nString KEY_SUFFIX = \"258EAFA5-E914-47DA-95CA-C5AB0DC85B11\";\nString computeReturnKey(String key) throws Exception {\nMessageDigest md = MessageDigest.getInstance(\"SHA-1\");\nbyte[] res = md.digest((key+KEY_SUFFIX).getBytes(Charset.forName(\"ascii\")));\nreturn Base64.encodeBytes(res);\n}\n``` \nA message has the following format: \n![websocketMsgFormat](/compSci/websocketMsgFormat.png) \nThe fields having the following meaning: \n- FIN marks the final fragment in a message\n- RSV = 000\n- OPCODE, operation code\n- 0x0 continuation frame\n- 0x1 text frame\n- 0x2 binary frame\n- 0x8 close\n- 0x9 ping\n- 0xA pong\n– MASK indicates content obfuscation (XOR masking) \nA message could look something like this: \n```cmd\n0x81 1000 0001 Final Fragment | Text frame\n0x85 1 000 0101 Masked / length = 5\n0x96 1001 0110 Masking Key\n0xa7 1010 0111 Masking Key\n0x2b 0010 1011 Masking Key\n0x38 0011 1000 Masking Key\n0xde 1101 1110 xor 10010110 = 0100 1000 H\n0xc2 1100 0010 xor 10100111 = 0110 0101 e\n0x47 0100 0111 xor 00101011 = 0110 1100 l\n0x54 0101 0100 xor 00111000 = 0110 1100 l\n0xf9 1111 1001 xor 10010110 = 0110 1111 o\n``` \nAn implementation of this process could be: \n```java", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Websockets", "Header 2": "Specification", "path": "../pages/digitalGarden/cs/distributedSystems/websockets.mdx"}, "page_content": "0xc2 1100 0010 xor 10100111 = 0110 0101 e\n0x47 0100 
0111 xor 00101011 = 0110 1100 l\n0x54 0101 0100 xor 00111000 = 0110 1100 l\n0xf9 1111 1001 xor 10010110 = 0110 1111 o\n``` \nAn implementation of this process could be: \n```java\nbyte[] maskingKey; // random masking key, 4 bytes\nbyte[] payloadData; // data to be transmitted\nbyte[] maskedData; // masked data to be generated\n// mask (on client)\nfor (int i = 0; i < payloadData.length; i++) {\nmaskedData[i] = (byte) (payloadData[i] ^ maskingKey[i % 4]);\n}\n// unmask (on server)\nfor (int i = 0; i < maskedData.length; i++) {\npayloadData[i] = (byte) (maskedData[i] ^ maskingKey[i % 4]);\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Websockets", "Header 2": "Specification", "Header 3": "Sub-Protocols", "path": "../pages/digitalGarden/cs/distributedSystems/websockets.mdx"}, "page_content": "WebSockets also offer the option for the client and server to agree on a protocol with which the transmitted data will be formatted and interpreted. Examples of sub-protocols are JSON, XML, MQTT, WAMP, STOMP, SOAP. These protocols can ensure agreement not only about the way the data is structured but also about the way communication must commence, continue and eventually terminate.
As long as it is defined in the handshake with the `Sec-WebSocket-Protocol` header and both parties understand what the protocol entails, anything goes.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Websockets", "Header 2": "JSR 356 Example", "path": "../pages/digitalGarden/cs/distributedSystems/websockets.mdx"}, "page_content": "```java\n@ClientEndpoint\npublic class EchoClient {\n\nprivate static CountDownLatch latch = new CountDownLatch(1);\n\n@OnOpen\npublic void onOpen(Session session) throws IOException {\nSystem.out.println(\"onOpen \" + Thread.currentThread());\nsession.getBasicRemote().sendText(\"Hello\");\n// session.getBasicRemote().sendBinary(ByteBuffer.wrap(new byte[]{'h', 'e', 'l', 'o'})); // sends a binary message\n\n// session.getBasicRemote().sendText(\"Hello\", false);\n// session.getBasicRemote().sendText(\"World\", true);\n}\n\n@OnMessage\npublic void onMessage(Session session, String message) throws IOException {\nSystem.out.println(\"onMessage \" + message + \" \" + Thread.currentThread());\nsession.close();\n}\n\n@OnClose\npublic void onClose(Session session, CloseReason closeReason) {\nSystem.out.printf(\"[%s] Session %s closed because of %s\\n\", Thread.currentThread(), session.getId(), closeReason);\nlatch.countDown();\n}\n\n@OnError\npublic void onError(Throwable exception, Session session) {\nSystem.out.println(\"an error occured on connection \" + session.getId() + \":\" + exception);\n}\n\npublic static void main(String[] args) throws Exception {\n// URI url = new URI(\"ws://86.119.38.130:8080/websockets/echo\");\nURI url = new URI(\"ws://localhost:2222/websockets/echo\");\n\n//System.out.println(Thread.currentThread());\nClientManager client = ClientManager.createClient();\nclient.connectToServer(EchoClient.class, url);\nlatch.await();\n}\n}\n``` \n```java\n@ServerEndpoint(\"/echo\")\npublic class EchoServer {\n\n{\nSystem.out.println(\"EchoServer created \" + this);\n}\n\npublic static void main(String[] args) throws 
Exception {\nServer server = new Server(\"localhost\", 2222, \"/websockets\", null, EchoServer.class);\nserver.start();\nSystem.out.println(\"Server started, press a key to stop the server\");\nSystem.in.read();\n}\n\n@OnOpen\npublic void onOpen(Session session) {\nSystem.out.printf(\"New session %s\\n\", session.getId());\n}\n\n@OnClose\npublic void onClose(Session session, CloseReason closeReason) {\nSystem.out.printf(\"Session %s closed because of %s\\n\", session.getId(), closeReason);\n}", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Websockets", "Header 2": "JSR 356 Example", "path": "../pages/digitalGarden/cs/distributedSystems/websockets.mdx"}, "page_content": "@OnClose\npublic void onClose(Session session, CloseReason closeReason) {\nSystem.out.printf(\"Session %s closed because of %s\\n\", session.getId(), closeReason);\n}\n\n@OnMessage\npublic String onMessage(String message, Session session) {\nSystem.out.println(\"received message from \" + session.getBasicRemote() + \": \" + message);\nreturn \"echo \" + message;\n}\n\n@OnError\npublic void onError(Throwable exception, Session session) {\nSystem.out.println(\"an error occurred on connection \" + session.getId() + \":\" + exception);\n}\n\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Networking", "Header 2": "TCP/IP and OSI Model", "path": "../pages/digitalGarden/cs/distributedSystems/networking.mdx"}, "page_content": "We define protocols as a means to standardize how computers interact with each other no matter the manufacturer or the parts inside. The most commonly used protocols are TCP, UDP and IP. The TCP/IP model splits protocols into four layers and the OSI model into seven, depending on their tasks.
\n![protocolModels](/compSci/protocolModels.png)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Networking", "Header 2": "IP Addressing", "path": "../pages/digitalGarden/cs/distributedSystems/networking.mdx"}, "page_content": "IP addresses are used to uniquely identify devices inside a network and are most commonly used in the IP protocol. There are IPv4 addresses which take up 32 bits and IPv6 addresses which take up 128 bits. \n![ipAddresses](/compSci/ipAddresses.png) \nUnicast addresses belong to a single network interface and a packet that is sent to a unicast address is delivered to the interface identified by that address. \nLoopback addresses are the addresses assigned to the loopback interface. Anything sent to these IP addresses loops around and becomes IP input on the local host. These addresses are often used when testing a client. \nMulticast addresses belong to a set of interfaces. A packet sent to a multicast address is delivered to all interfaces identified by\nthat address. \nA broadcast address is used to target all systems on a specific subnet network instead of single hosts. In other words broadcast addresses allow information to be sent to all machines on a given subnet rather than to a specific machine so can also be classified as a multicast address.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Networking", "Header 2": "Sockets", "path": "../pages/digitalGarden/cs/distributedSystems/networking.mdx"}, "page_content": "Sockets are an abstraction through which applications can send and receive data through a network. A socket is one endpoint of a two-way communication link between two programs running on the network. A socket is bound to a port number so that the transport layer can identify the application that the data is destined to be sent to. 
\nA server waits for requests on a particular port. When a client connects to the server it discloses its own address and port so that the server knows where to send the response. \nThere are stream sockets which use the TCP protocol and provide a reliable byte stream between the two applications. Packets are delivered in the correct order and lost packets are retransmitted by the TCP protocol. The connection is also full duplex, meaning data can be sent and received over one connection instead of needing one for each direction. When transmission is finished one or both parties close the connection. \nDatagram sockets use the UDP protocol and aren't as reliable as stream sockets: datagrams can be lost, duplicated or arrive out of order. \n![sockets](/compSci/sockets.png)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Messaging", "Header 2": "Asynchronous Communication", "path": "../pages/digitalGarden/cs/distributedSystems/messaging.mdx"}, "page_content": "We have seen synchronous communication already, for example sockets or a telephone connection. However, there is also the asynchronous communication model which involves a Message Oriented Middleware (MOM) between the client and the server. This form of communication can be seen for example when sending an email where the Mail Server is the MOM. Asynchronous communication allows for applications to be loosely coupled as they only need to agree on the message format and not the API. It also means that the sender and receiver don't have to be active at the same time.
\nWhen it comes to asynchronous communication there is either the JMS (Java Message Service / Jakarta Messaging) API or wire protocols which run on top of TCP such as: \n- AMQP: Advanced Message Queuing Protocol, which is a binary protocol with four messaging models (direct, topic, fanout and headers).\n- STOMP: Simple (or Streaming) Text Oriented Messaging Protocol, which is a text-based protocol similar to HTTP.\n- MQTT: MQ Telemetry Transport, a lightweight protocol intended to be used in IoT environments (small footprint) with a publish and subscribe pattern and no queues. It supports the delivery guarantees at most once, at least once and exactly once, as well as a last will message.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Messaging", "Header 2": "AMQP - Advanced Message Queuing Protocol", "path": "../pages/digitalGarden/cs/distributedSystems/messaging.mdx"}, "page_content": "As mentioned the AMQP protocol is a binary protocol that contains the following three key components: \n- Exchanges: Endpoints of the broker that receive messages.\n- Queues: Endpoints that store messages from exchanges and are used by subscribers to retrieve messages.\n- Bindings/Routings: Rules that bind/route exchanges to queues. \nThese concepts are programmable, meaning they can be created, modified and deleted. There can also be multiple channels inside a single TCP connection which saves the overhead of having multiple connections. When a message is sent the sender needs to define the exchange, the routing key and the payload.
\n![amqpComponents](/compSci/amqpComponents.png) \nThere are four patterns when interacting with messages: \n- Direct Exchange: Queues that are bound to an exchanger with the same key that is used to\npublish a message will receive the message\n- Fanout Exchange: Broadcast of the message to all queues that are bound to it (binding key is not used) which is suitable for the publish and subscribe pattern.\n- Topic Exchange: Routes messages to all queues that have a binding key that matches the routing key which is suitable for routing messages to different queues based on the type of message.\n- Headers Exchange: Messages are routed based on custom message headers \n![amqpMessagePatterns](/compSci/amqpMessagePatterns.png)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Messaging", "Header 2": "RabbitMQ", "path": "../pages/digitalGarden/cs/distributedSystems/messaging.mdx"}, "page_content": "Is an open-source message broker written in Erlang that implements the AMQP protocol which can also be extended to MQTT, STOMP and HTTP. \nFirst you need to connect to a broker: \n```java\nConnectionFactory factory = new ConnectionFactory();\nfactory.setUsername(\"username\"); // Default: guest\nfactory.setPassword(\"password\"); // Default: guest\nfactory.setVirtualHost(\"myRabbit\"); // Default: /\nfactory.setHost(\"69.69.69.69\"); // Default: localhost\nfactory.setPort(5672); // Default: 5672\nConnection connection = factory.newConnection();\nChannel channel = connection.createChannel();\n``` \nThe next step would be to declare a queue. Durable queues survive server restarts, exclusive queues are restricted to a connection and auto-delete means the server can delete the queue when it is no longer used. 
\n```java\nchannel.queueDeclare(QUEUE_NAME,\n/* durable: */ false,\n/* exclusive: */ false,\n/* autoDelete: */ false,\n/* arguments: */ null\n);\n``` \nTo then publish a message we can use the default exchange: \n```java\nString message = \"Hello World at \" + LocalDateTime.now();\nchannel.basicPublish(\n/* exchange: */ \"\",\n/* routing key: */ QUEUE_NAME,\n/* props: */ null,\n/* body: */ message.getBytes(StandardCharsets.UTF_8));\n``` \nTo then finally receive a message callbacks can be registered: \n```java\nDeliverCallback deliverCallback = (consumerTag, message) -> {\nString text = new String(message.getBody(), \"UTF-8\");\n};\nCancelCallback cancelCallback = consumerTag -> {\nSystem.out.println(\"Cancelled by the server\");\n};\nchannel.basicConsume(QUEUE_NAME,\n/* autoAck */ true, deliverCallback, cancelCallback);\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Messaging", "Header 2": "RabbitMQ", "Header 3": "Echo Example", "path": "../pages/digitalGarden/cs/distributedSystems/messaging.mdx"}, "page_content": "```java\nchannel.queueDeclare(RPC_QUEUE_NAME,\n/* durable: */ false,\n/* exclusive: */ false,\n/* autoDelete: */ false,\n/* arguments: */ null);\n\nfinal String corrId = UUID.randomUUID().toString();\n\nString replyQueueName = channel.queueDeclare().getQueue();\nSystem.out.println(replyQueueName);\n\nAMQP.BasicProperties props = new AMQP.BasicProperties\n.Builder()\n.correlationId(corrId)\n.replyTo(replyQueueName)\n.build();\n\nString message = String.format(\"Hello World from %s at %s\",\nSystem.getProperty(\"user.name\"),\nLocalDateTime.now());\nSystem.out.println(message);\n\nchannel.basicPublish(\n/* exchange: */ \"\", // Exchange: empty string is called \"default exchange\" which is a direct exchange.\n/* routing key: */ RPC_QUEUE_NAME,\n/* props: */ props,\n/* body: */ message.getBytes(StandardCharsets.UTF_8));\n\n\nfinal BlockingQueue<String> response = new ArrayBlockingQueue<>(1);\n\nString ctag = channel.basicConsume(replyQueueName,
true, (consumerTag, delivery) -> {\nif (delivery.getProperties().getCorrelationId().equals(corrId)) {\nresponse.offer(new String(delivery.getBody(), \"UTF-8\"));\n}\n}, consumerTag -> {\n});\n\nString result = response.take();\nchannel.basicCancel(ctag);\n\nSystem.out.println(result);\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Messaging", "Header 2": "RabbitMQ", "Header 3": "Publish and Subscribe", "path": "../pages/digitalGarden/cs/distributedSystems/messaging.mdx"}, "page_content": "```java\n// PUBLISHER\nchannel.exchangeDeclare(EXCHANGE_NAME, \"fanout\");\nString message = \"Current Date: \" + LocalDateTime.now();\nchannel.basicPublish(EXCHANGE_NAME,\n/* routing key: */ \"\", // ignored by fanout exchanges\n/* properties: */ null,\n/* body: */ message.getBytes(StandardCharsets.UTF_8));\n\n// SUBSCRIBER\nchannel.exchangeDeclare(EXCHANGE_NAME, \"fanout\");\nString queueName = channel.queueDeclare().getQueue();\nchannel.queueBind(queueName, EXCHANGE_NAME, \"\");\nDeliverCallback deliverCallback = (consumerTag, delivery) -> {\nString message = new String(delivery.getBody(), \"UTF-8\");\nSystem.out.println(\"Received '\" + message + \"'\");\n};\nchannel.basicConsume(queueName,\n/* autoAck */ true, deliverCallback, consumerTag -> {});\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "IETF", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "IETF stands for Internet Engineering Task Force, an organization with the goal of making the internet work better. It defines internet standards and shapes the internet's architecture with so-called RFCs (Requests for Comments).
Some of the most well known RFCs are the following: \n- RFC 791: [Internet Protocol](https://datatracker.ietf.org/doc/html/rfc791)\n- RFC 5322: [Internet Message Format (email)](https://datatracker.ietf.org/doc/html/rfc5322)\n- RFC 2549: [IP over Avian Carriers (IPoAC)](https://datatracker.ietf.org/doc/html/rfc2549)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "HTTP stands for Hypertext Transfer Protocol and is used to access static or dynamic data on another computer. It is based on a reliable transport layer protocol such as TCP. \n![httpProtocol](/compSci/httpProtocol.png)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "Header 3": "Request", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "An HTTP request consists of a request line which holds the method of the request (GET, POST etc.), the URL of the target and the version of the HTTP protocol to be used followed by a carriage return line feed (CR LF). You then have the headers which are key-value pairs separated by ':' with a CR LF at the end of each one. An empty line separates the headers from the body, which is a chunk of bytes. \n![httpRequest](/compSci/httpRequest.png) \n#### Methods \nIdempotent means that multiple identical requests will have the same outcome. So it does not matter if a request is sent once or multiple times. The following HTTP methods are idempotent: GET, HEAD, OPTIONS, TRACE, PUT and DELETE.
\n![httpRequestMethods](/compSci/httpRequestMethods.png) \n#### Headers \n| Key | Description | Example |\n| --------------- | ------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------- |\n| Host | specifies the host and port number of the server to which the request is being sent | Host: developer.mozilla.org:8080 |\n| Accept | indicates which content types, expressed as MIME types, the client can understand | Accept: text/html |\n| Accept-Language | indicates the natural language and locale that the client prefers | Accept-Language: de-CH |\n| Accept-Encoding | indicates the content encoding (usually a compression algorithm) that the client can understand | Accept-Encoding: gzip |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "Header 3": "Request", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "| Accept-Encoding | indicates the content encoding (usually a compression algorithm) that the client can understand | Accept-Encoding: gzip |\n| User-Agent | lets servers and network peers identify the application, operating system, vendor, and/or version of the requesting user agent | User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:47.0) |\n| Referer | contains an absolute or partial address of the page that makes the request | Referer: https://example.com/ |\n| Connection | controls whether the network connection stays open after the current transaction finishes | Connection: keep-alive, Connection: close |\n| Cookie | contains stored HTTP cookies associated with the server | Cookie: PHPSESSID=298zf09hf012fh2; |\n| Content-Length | indicates the size of the message body, in bytes, sent to the recipient | Content-Length: 4 |\n| Content-Type | indicates the original media type of the resource (prior to any content encoding applied for 
sending) | Content-Type: text/html; charset=UTF-8 |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "Header 3": "Response", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "A HTTP reponse is built very similiar to a request but instead of a request line it has a status line which also holds the version of the HTTP protocol to be used followed by a status code and message.\n![httpResponse](/compSci/httpResponse.png) \n#### Codes \n![httpReponseCodes](/compSci/httpReponseCodes.png) \n#### Headers \n| Key | Description | Example |\n| ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------- |\n| Content-Length | indicates the size of the message body, in bytes, sent to the recipient | Content-Length: 4 |\n| Content-Type | indicates the original media type of the resource (prior to any content encoding applied for sending) | Content-Type: text/html; charset=UTF-8 |\n| Content-Encoding | lists any encodings that have been applied to the representation (message payload), and in what order | Content-Encoding: gzip |\n| Location | indicates the URL to redirect a page to. 
It only provides a meaning when served with a 3xx (redirection) or 201 (created) status response | Location: /index.html |\n| Date | contains the date and time at which the message originated | Date: Wed, 21 Oct 2015 07:28:00 GMT |\n| Last-Modified | contains a date and time when the origin server believes the resource was last modified | Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "Header 3": "Response", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "| Last-Modified | contains a date and time when the origin server believes the resource was last modified | Last-Modified: Wed, 21 Oct 2015 07:28:00 GMT |\n| Expires | contains the date/time after which the response is considered expired | Expires: Wed, 21 Oct 2015 07:28:00 GMT |\n| Server | describes the software used by the origin server that handled the request | Server: Apache/2.4.1 (Unix) |\n| Transfer-Encoding | specifies the form of encoding used to safely transfer the payload body to the user | Transfer-Encoding: chunked |\n| Cache-Control | control caching in browsers and shared caches | Cache-Control: no-cache |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "Header 3": "MIME Types", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "The Content-Type or MIME (Multipurpose Internet Mail Extensions) type specifies type of the body, like text/javascript or something else like audio, video, etc. being sent between client and server. MIME types are not limited to HTTP, they are used in many other locations. \n`Media-Type = type / subtype { “;” parameter }` \nTypes: text / image / audio / video / application / message / multipart \nSubtypes that start with x are non standard subtypes. 
\nFor example: \n- Media-Type: text/html;charset =ISO 8859 1\n- Media-Type: application/octet stream\n- Media-Type: image/jpeg", "type": "Document"} -{"id": null, "metadata": {"Header 1": "The Internet", "Header 2": "HTTP Protocol", "Header 3": "Enhancements", "path": "../pages/digitalGarden/cs/distributedSystems/internet.mdx"}, "page_content": "Over the years the HTTP protocol has been enhanced and newer versions have been released/specified. \n#### Version 1.1 \n- Network connection management\n- Persistent connections were introduced, meaning that several requests can be sent over the same connection\n- Pipelining was introduced, meaning you can send a new request before the previous ones have even been answered \n- Bandwidth optimization\n- Clients can request parts/ranges of documents for example to complete a previously interrupted request\n- Message transmission\n- Trailers were introduced, meaning message headers can be delivered at the end of the body which can also be similar to a checksum\n- Transfer encoding and content length. Clients reading a resposne need to know when they have reached the end. Servers can indicate the end of a message in four ways\n- Implied content length, for example certain response codes like 304 are defined to never have content, so the client can assume the response to terminate with a double CR LF\n- Content-length header, the length of the content is specified in the content-length attribute in bytes.\n- Chunked encoding, the content is broken down into a number of chunks each prefixed by its size in bytes, a zero size chunk then indicated the end of the message. For this to work the server must set the header to `transfer-encoding : chunked`.\n- Internet address conservation \n#### Version 2.0 \n- Binary, packet-based protocol. With the switch to 2.0 all HTTP messages are split and sent in clearly defined frames. This also means that chunked transfer encoding must not be used with HTTP/2.0. 
This switch provides more mechanisms for data streaming but also allows for more efficiency.\n- Multiple requests can now be sent in parallel over a single TCP connections.\n- HPACK, headers are compressed and cached on the server\n- Servers can push resources together with a requested resources for example a script or css file along with a HTML page.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Euclidean Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "The euclidean algorithm is used to find the GCD of two numbers. This can be done locally but also using a distributed algorithm: \n```java\nint gcd(int a, int b) {\nassert a >= 0 && b >= 0;\nwhile(b != 0) {\nint tmp = b;\nb = a % b;\na = tmp;\n}\nreturn a;\n}\n``` \nIn the distributed algorithm each node has a reference to its left and right neighbor. The node first informs its neighbors of its own value. If a received number is smaller than its value then a node adjusts its value and shares its new value with its neighbors. 
\n```java\npublic class GcdActor extends AbstractActor {\nprivate int n;\nprivate final Set neighbours = new HashSet<>();\npublic GcdActor(int n) {\nthis.n = n;\nSystem.out.printf(\"%s Initial Value: %d%n\", getSelf(), n);\n}\n@Override\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(ActorRef.class, actor -> {\nneighbours.add(actor);\nif(neighbours.size() == 2) {\nneighbours.forEach(a -> a.tell(n, getSelf()));\n}})\n.match(Integer.class, value -> {\nif(value < n) {\nn = ((n-1) % value) + 1;\nneighbours.forEach(a -> a.tell(n, getSelf()));\nSystem.out.printf(\"%s Current Value: %d%n\",\ngetSelf(), n);\n}})\n.matchAny(msg -> unhandled(msg))\n.build();\n}\n}\npublic static void main(String[] args) throws Exception {\nActorSystem as = ActorSystem.create();\nList values = List.of(108, 76, 12, 60, 36);\nList actors = IntStream.range(0, values.size())\n.mapToObj(n -> as.actorOf(Props.create(GcdActor.class, values.get(n)), \"GCD\"+n))\n.collect(Collectors.toList());\nfinal int size = actors.size();\nfor(int i = 0; i < size; i++) {\nactors.get(i).tell(actors.get((i+1) % size), null);\nactors.get(i).tell(actors.get((size+i-1) % size), null);\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Echo Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "The idea of the echo algorithm is to traverse an arbitrary graph by implicitly building a spanning tree. The algorithm is defined as followed: \nThere are two types of messages: Explorer messages, which color the nodes red, and Echo messages, which color the nodes green. Before the algorithm is executed, all nodes are white. \n1. An initiator turns red and sends an explorer to all of his neighbors.\n2. A white node that receives an explorer turns red\n3. A node that has received an explorer or an echo over all of its edges turns green\n4. 
A non-initiator node that has received an explorer or an echo over all of its edges sends an echo over the edge over which it received the first explorer\n5. The algorithm terminates when the initiator turns green \nThe edges over which the echo messages have run result in a spanning tree. For a Graph with $E$ edges this algorithm uses 2 * E messages. \n```java\npublic class EchoNode extends AbstractActor {\nprivate final Set < ActorRef > neighbours = new HashSet < > ();\nprivate ActorRef parent;\nprivate int counter = 0; // number of received tokens\n@Override\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(ActorRef.class, actor -> neighbours.add(actor))\n.match(Start.class, value -> {\nparent = getSender(); // initiator\nneighbours.forEach(a -> a.tell(new Token(), getSelf()));\n})\n.match(Token.class, msg -> {\ncounter++;\nif (parent == null) { // variant: if(counter == 1)\nparent = getSender();\nSystem.out.printf(\"Actor %s got informed by %s%n\",\ngetSelf(), getSender());\nneighbours.stream()\n.filter(a -> a != parent)\n.forEach(a -> a.tell(msg, getSelf()));\n}\nif (counter == neighbours.size()) {\nparent.tell(msg, getSelf());\n}\n})\n.matchAny(msg -> unhandled(msg))\n.build();\n}\npublic static void main(String[] args) throws Exception {\nActorSystem as = ActorSystem.create();\nList < ActorRef > actors = IntStream.range(0, 8)\n.mapToObj(n -> as.actorOf(Props.create(EchoNode.class), \"Node\" + n))\n.collect(Collectors.toList());\naddEdge(actors, 0, 1);\n...\naddEdge(actors, 7, 5);", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Echo Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "ActorSystem as = ActorSystem.create();\nList < ActorRef > actors = IntStream.range(0, 8)\n.mapToObj(n -> as.actorOf(Props.create(EchoNode.class), \"Node\" + n))\n.collect(Collectors.toList());\naddEdge(actors, 0, 1);\n...\naddEdge(actors, 7, 
5);\nTimeout timeout = new Timeout(5, TimeUnit.SECONDS);\nFuture < Object > f = Patterns.ask(actors.get(0), new Start(), timeout);\nObject result = Await.result(f, timeout.duration());\nSystem.out.println(result);\nas.terminate();\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Election Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "The idea of the election algorithm is to elect a leader among equal nodes for example to coordiante concurrency etc. For the alogrithm to work we need to assume that each node has a unique identifier that can be ordered. The node with the highest order is then the leader. At the end each node should know who the leader is. \nInitially the value in each node is negative infinity.\nEvery node can start the election as long as it is not yet involved in an election, i.e. value is neg inf.\nUpon start, a node stores its id number in the value field and sends this value to the next node.\nIf a message arrives, its value is compared with the stored one. If it is greater than the stored value, the value is updated and the message is\nforwarded. If it is smaller, then the message is discarded.\nA node is leader if it receives its own message. The leader then may inform the other nodes about the election / termination of\nthe algorithm. 
\n```java\npublic class ElectionNode extends AbstractActor {\nprivate ActorRef next; // ring references\nprivate ActorRef initiator; // initiator of the election\nprivate final int id; // id of this actor\nprivate int master = Integer.MIN_VALUE; // id of elected node\npublic ElectionNode(int id) {\nthis.id = id;\n}\n@Override\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(ActorRef.class, actor -> next = actor)\n.match(Start.class, value -> {\nif (master == Integer.MIN_VALUE) {\ninitiator = getSender();\nmaster = id;\nnext.tell(new Token(master), getSelf());\n}\n})\n.match(Token.class, token -> {\nif (token.value > master) {\nmaster = token.value;\nnext.tell(token, getSelf());\n} else if (token.value == id) {\nSystem.out.println(\"hurray, I got elected \" + getSelf());\nnext.tell(new Reset(id), getSelf());\n}\n})\n.match(Reset.class, token -> {\nmaster = Integer.MIN_VALUE;\nif (token.value == id) {\ninitiator.tell(\"\" + id, getSelf());\n} else {\nnext.tell(token, getSelf());\n}\n})\n.matchAny(msg -> unhandled(msg))\n.build();\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "Distributed system that provides a lookup service like a hash table. Key Value pairs are stored in the nodes of a DHT\nAny participating node can efficiently retrieve the value associated with a given key. The tricky part is figuring out which node is responsible for which key? How do we handle changes to the network topology? 
Nodes can join or leave the network at any time.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "Header 3": "Consistent Hashing", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "The goal of consistent hashing is to minimize the number of remaps if nodes are added or removed i.e. if the table is resized we only want $\\frac{n}{m}$keys need to be remapped on average with $n$ = number of keys and $m$ = number of nodes.\nKeys and Nodes are mapped to the same ID space (Integers) Nodes: hash(IP), Keys: hash(key). Hash Functions: \n- SHA 1 => 160bit $2^160$ possible nodes\n- Java => 32bit $2^32$ possible nodes \nEach object (keys and nodes) is mapped to a point on a circle for example: if we use 6bit objects then we have the ID space: 0 .. $2^6 - 1$ = 63). Each key is stored at its successor, i.e. in the node with the next higher or equal ID. This has the following advantages: \n- All nodes store roughly the same number of keys if the hash function is uniform.\n- If a node joins or leaves, only a fraction of the keys need to be moved to a different node, i.e. only the successor\nof a node is involved. \nThis technique can be implemented in different ways. Either we have a complete graph so each node knows the location of every other node which leads to a lookup complexity of $O(1)$ but storage of the routing table takes up $O(n)$ with $n$ being the number of nodes. 
\n```java\nrecord Put(Object key, Object value) {} // used to initate a put\nrecord Put2(Object key, Object value) {} // used to store at dest node\nrecord Get(Object key) {} // used to initate a get\nrecord Get2(Object key) {} // used to get at dest node\nrecord Result(int id, Object value) {} // used to return result\nrecord AddNode(int id, ActorRef actor) {} // initiates an add node\nrecord Partition(int id) {} // used to partition a node\nrecord PartitionAnswer(Map < Object, Object > map) {}\n// answer to a partition request\nrecord Print() {} // debugging, i.e. print node info on console\npublic class HashNode extends AbstractActor {\nprivate final int id; // id of this node\n// references to all actors\nprivate final TreeMap < Integer, ActorRef > actors = new TreeMap < > ();\n// data stored in this node", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "Header 3": "Consistent Hashing", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "public class HashNode extends AbstractActor {\nprivate final int id; // id of this node\n// references to all actors\nprivate final TreeMap < Integer, ActorRef > actors = new TreeMap < > ();\n// data stored in this node\nprivate final Map < Object, Object > values = new HashMap < > ();\npublic HashNode(int id) {\nthis.id = id;\n}\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(Map.class, actors -> {\nthis.actors.putAll(actors);\n})\n.match(Get.class, msg -> {\nvar keys = actors.navigableKeySet();\nvar key = keys.ceiling(msg.key().hashCode());\n// ceiling returns the least element in this set\n// greater than or equal to the given element,\n// or null if there is no such element.\nif (key == null) key = keys.first();\nactors.get(key).tell(new Get2(msg.key()), getSender());\n})\n.match(Get2.class, msg -> {\ngetSender().tell(new Result(id, 
values.get(msg.key())),\ngetSelf());\n})\n.matchAny(msg -> {\nunhandled(msg);\n})\n.build();\n}\n}\n``` \nOr we can have a cyclic graph so each node only knows the location of its successor, this leads to a lookup complexity of $O(n)$ but storage only uses $O(1)$. \n```java\nrecord Put(Object key, Object value) {} // used to initate a put\nrecord Put2(Object key, Object value, int previousId) {}\n// used to distrbibute put in the ring\nrecord Get(Object key) {} // used to initate a get\nrecord Get2(Object key, int previousId, int counter) {}\n// used to distribute get in the ring\nrecord Result(int id, Object value, int counter) {}\nrecord SetNext(int nextId, ActorRef next) {}\nrecord AddNode(int newId, ActorRef newActor) {}\nrecord Partition(int id) {} // used to partition a node, i.e. return\n// all elements <= id\nrecord PartitionAnswer(Map < Object, Object > map) {}\nrecord Print(ActorRef start) {} // print node info on console\npublic class HashNode extends AbstractActor {\nprivate final int id; // id of this node\nprivate ActorRef next; // next node in the ring\nprivate int nextId; // id of next node in ring\nprivate Map < Object, Object > values = new HashMap < > (); // data\npublic HashNode(int id) {", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "Header 3": "Consistent Hashing", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "private final int id; // id of this node\nprivate ActorRef next; // next node in the ring\nprivate int nextId; // id of next node in ring\nprivate Map < Object, Object > values = new HashMap < > (); // data\npublic HashNode(int id) {\nthis.id = id;\n}\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(SetNext.class, msg -> {\nnext = msg.next();nextId = msg.nextId();\n})\n.match(Put.class, msg -> {\nnext.tell(new Put2(msg.key(), msg.value(), this.id), null);\n})\n.match(Get.class, msg -> 
{\nnext.tell(new Get2(msg.key(), this.id, 1), getSender());\n})\n.match(Put2.class, msg -> {\nint hash = msg.key().hashCode();\nif (between(hash, msg.previousId(), this.id)) {\nvalues.put(msg.key(), msg.value());\n} else {\nnext.tell(new Put2(msg.key(), msg.value(), this.id), null);\n}\n})\n.match(Get2.class, msg -> {\nint hash = msg.key().hashCode();\nif (between(hash, msg.previousId(), this.id)) {\ngetSender().tell(new Result(id, values.get(msg.key()),\nmsg.counter()), getSelf());\n} else {\nnext.tell(new Get2(msg.key(), id, msg.counter() + 1),\ngetSender());\n}\n})\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "Header 3": "Chord Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "The chord algorithm and protocol implements a distributed hash table with a lookup time of log(N) and is based on Consistent Hashing. It uses so called finger tables. In these tables every node knows up to $m$ other nodes, and the distance of the nodes it knows increases exponentially (m is the bit length\nof the hash function). Meaning the The i-th entry (0..m-1) in the table of node n contains a reference to the successor $((n + 2^i ) \\mod 2^m)$ the first entry of the finger table is the immediate successor. Example: 16 node Chord network (m = 4). \n#### Lookup \nThe finger table is used to find the predecessor of the node which stores a given key. \n1. Node 10 is asked to look up key 5 =\\> Finger table refers to node 43.\n2. Node 43 is asked to look up key 5 =\\> Finger table refers to node 1\n3. 
Node 1 is asked to look up 5 =\\> Key is between 2 and 10 (1 \\< 5 \\<= 10), so Node 10 contains the searched key and its associated value \n```c\nn.find_successor(id)\nif id in (n, successor] then // n < id && id <= successor\nreturn successor // this is the node which contains key id\nelse\n// forward the query around the circle\nn0 = closest_preceding_node(id)\nreturn n0.find_successor(id)\n// search the local table for the highest predecessor of id\nn.closest_preceding_node(id)\nfor (int i = m - 1; i >= 0; i--)\ndo\nif (finger[i] in (n, id)) then\nreturn finger[i]\nreturn n\n``` \n#### Join \nIf a new node joins, the following invariants must be maintained: \n- Each node refers to its immediate successor =\\> ensures correctness\n- Each ( key,value ) pair is stored in successor(hash(key)) =\\> ensures correctness\n- The finger table of each node should be correct =\\> keeps query operation fast \n```java\nrecord Put(Object key, Object value) {} // used to distribute in ring\nrecord Put2(Object key, Object value) {} // put in destination node\nrecord Get(Object key, int counter) {} // used to distribute in ring\nrecord Get2(Object key, int counter) {} // get in destination node\nrecord Result(int id, Object value, int counter) {}", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "Header 3": "Chord Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "record Put2(Object key, Object value) {} // put in destination node\nrecord Get(Object key, int counter) {} // used to distribute in ring\nrecord Get2(Object key, int counter) {} // get in destination node\nrecord Result(int id, Object value, int counter) {}\nrecord Partition(int id) {} // used to partition a node, i.e. return\n// all elements <= id\nrecord PartitionAnswer(Map < Object, Object > map) {}\nrecord Print() {} // debugging, i.e. 
print node info on console", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Distributed Algorithms", "Header 2": "Distributed Hash Tables", "Header 3": "Chord Algorithm", "path": "../pages/digitalGarden/cs/distributedSystems/distributedAlgorithms.mdx"}, "page_content": "public class HashNode extends AbstractActor {\nprivate final int id; // id of this node\nprivate int next; // id of next node\nprivate TreeMap < Integer, ActorRef > fingerTable;\nprivate Map < Object, Object > values = new HashMap < > ();\npublic HashNode(int id) {\nthis.id = id;\n}\npublic Receive createReceive() {\nreturn receiveBuilder()\n.match(TreeMap.class, fingerTable -> {\nthis.fingerTable = fingerTable;\n})\n.match(Integer.class, next -> this.next = next)\n.match(Get.class, msg -> {\nint hash = msg.key().hashCode();\nif (between(hash, id, next)) {\nfingerTable.get(next).tell(\nnew Get2(msg.key(), msg.counter() + 1), getSender());\n} else {\nvar set = fingerTable.navigableKeySet();\nvar prev = set.lower(hash);\nif (prev == null) prev = set.last();\nfingerTable.get(prev).tell(\nnew Get(msg.key(), msg.counter() + 1), getSender());\n}\n})\n.match(Get2.class, msg -> {\ngetSender().tell(new Result(this.id,\nvalues.get(msg.key()), msg.counter()), getSelf());\n})\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "REST stands for \"Representational State Transfer\" and is a architecture for distributed systems. The term REST originated in Roy Fielding's PhD in 2000 who was one of the main authors of the HTTP protocol specification. 
REST does not enforce any rules regarding how it should be implemented however it does define some design guidelines/constraints that should be followed if the system is to be truly RESTful.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Constraints", "Header 3": "Uniform Interface", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "Resources (data) are the key abstraction in REST. The interface of the API should be uniform meaning there shouldn't be lots of different ways of doing the same thing which also means that a resource in the system should only have one logical URI. The resource should however not be too large but still contain everything in its representation. Whenever relevant, a resource should also contain links pointing to relative URIs to fetch related information (HATEOAS = Hypermedia as the Engine of Application State).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Constraints", "Header 3": "Client-Server", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "Client applications and server applications must be able to evolve separately without any dependency on each other. For this reason you often see versioning of APIs so that clients are reverse compatible.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Constraints", "Header 3": "Stateless", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "All client-server interactions should be stateless. 
This means the server does not store anything about the latest HTTP request that the client made and will treat every request as new.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Constraints", "Header 3": "Cacheable", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "Caching has large performance benefits for the client but also reduces the load of the server. So in REST, resources should be cached then declare themselves cacheable.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Constraints", "Header 3": "Layered System", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "REST allows you to use a layered system architecture where you deploy the APIs (Controllers) on server A, and store data on server B and authenticate requests in Server C (Services), for example. A client should not be able to tell whether it is connected directly to the end server or an intermediary along the way.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Common Patterns", "Header 3": "Collection and Element structure", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "Resources are often structure using urls for collections or singular elements. This especially helps to fullfil the constraint of having a uniform interface. In the below example a single product is an element and all products make up a collection. 
\n| Request | Description |\n| --------------------- | ----------------------------------------------------------------------------------------------------------------- |\n| GET `/products` | List all elements in the collection (products) |\n| POST `/products` | Add a product to the collection |\n| DELETE `/products` | Remove the collection including all of its elements |\n| GET `/products/id` | Read the element (product) with its unique identifier=`id` |\n| PUT `/products/id` | Update the element (with the updated item in the body, often without the unique identifier as already in the URI) |\n| DELETE `/products/id` | Remove the element corresponding to the id |", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Common Patterns", "Header 3": "Put vs Patch", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "Their is often the discussion of what the differences are between the PATCH and the PUT methods. The biggest difference is that PUT is idempotent and PATCH isn't meaning it can cause side effects. The other key difference is that PUT sends the modified version of the resource whereas PATCH just sends instructions describing how a resource should be modified (most often just the to be modified fields).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "What is jakarta?", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "Header 3": "Client", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "Jakarta offers some packages that make submitting requests to an API very easy. The general process is as followed: \n1. Obtain an instance of a client.\n2. Create and configure a WebTarget which represents the API.\n3. Create and configure a request from the WebTarget.\n4. 
Submit the request. \n```java\npublic class RestClient {\n\nprivate static final String REST_URI = \"http://localhost:3001/api/v1/\";\n\npublic static void main(String[] args){\nClient client = ClientBuilder.newClient();\nWebTarget rootTarget = client.target(REST_URI); // immutable with respect to URI\nWebTarget productsTarget = rootTarget.path(\"products\"); // mutable with respect to configuration\n\nInvocation.Builder invocationBuilder = productsTarget.request(MediaType.APPLICATION_JSON);\nResponse getResponse = invocationBuilder.get(Product.class);\nProduct product = new Product(\"Logitech mouse\", 3); // title, amount\nResponse postResponse = invocationBuilder.post(Entity.entity(product, MediaType.APPLICATION_JSON);\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "Header 3": "RESTful API with JAX-RS", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "JAX-RS is a specification of annotations for server services. With them we can create a RESTful API. For this we create singletons that bind to a certain http method (method binding). \n#### Injection \nWith injections we can extract values from the request. These values can then also be automatically converted to the correct type. This automatic type conversion is possible from string to primitive types, to class `T` that either has a constructor with a single string parameter or a static method `T valueOf(String arg)`. 
\nYou can also get various context objects: \n- `@Context UriInfo` \n```java\n@Singleton\n@Path(\"/products\")\npublic class ProductResource{\n\n@Context\nprivate UriInfo info; // also HttpHeaders, Request, SecurityContext, Providers etc.\n@GET\n@Path(\"{id}\")\npublic Response getProduct(@PathParam(\"id\") String id) {\n// For complexer responses\nString product = ...; // get from DB or whatever\nResponseBuilder builder = Response.ok(book); // body\nbuilder.language(\"en\").header(\"Some-Header\", \"some value\");\n\nreturn builder.build();\n}\n\n@POST\n@Path(\"{title}-{amount}\")\npublic String createProduct(\n@PathParam(\"title\") String title,\n@PathParam(\"amount\") int amount,\n@DefaultValue(10) @QueryParam(\"price\") int price,\n@HeaderParam(\"Referer\") String referer,\n@CookieParam(\"customerId\") Cookie customerId )\n) { ... }\n\n@PUT\n@Path(\"{id}\")\npublic String updateProduct(@PathParam(\"id\") String productID, String body) { ... }", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "Header 3": "RESTful API with JAX-RS", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "@PUT\n@Path(\"{id}\")\npublic String updateProduct(@PathParam(\"id\") String productID, String body) { ... }\n\n@DELETE\n@Path(\"{id}\")\npublic void deleteProduct(@PathParam(\"id\") String productID) { ... }\n}\n``` \n#### Content Negotiation \nYou can also use the `@Produces` annotation to declare the type of result. \n```java\n@Produces({\"text/plain\", \"text/html\"})\n@Produces({\"application/xml\", \"application/json\"})\n``` \nThe `@Consumes` annotation does something very similiar and declare the type which is accepted. \n```java\n@Consumes(\"application/x-www-form-urlencoded\")\n@Consumes({\"application/xml\", \"application/json\"});\n``` \n#### Content Handlers \nData binding or also often called marshalling is the process of converting data from or to the body. 
Above we have used string but you can also use a byte[], InputStream etc.. If a request sends data using `\"application/x-www-form-urlencoded\"` you can read it in with a `MultivalueMap` and/or `MessageBodyWriter`. To be able to use the provider then in a client you need to register it. \n```java\nClient c = ClientBuilder.newClient();\nc.register(XStreamProvider.class);\n``` \n```java\n@Provider\n@Consumes(\"application/xstream\")\n@Produces(\"application/xstream\")\npublic class XStreamProvider implements MessageBodyReader, MessageBodyWriter {\n\nprivate XStream xstream = new XStream (new DomDriver());\n\npublic boolean isReadable(Class type, Type genericType,\nAnnotation[] annotations, MediaType mimeType) {\nreturn true;\n}\n\npublic Object readFrom(Class type, Type genericType,\nAnnotation[] annotations, MediaType mimeType,\nMultivaluedMap httpHeaders, InputStream entityStream) {\nreturn xstream.fromXML(entityStream);\n}\n\npublic boolean isWriteable(Class type, Type genericType,\nAnnotation[] annotations, MediaType mimeType) {\nreturn true;\n}", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "Header 3": "RESTful API with JAX-RS", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "public boolean isWriteable(Class type, Type genericType,\nAnnotation[] annotations, MediaType mimeType) {\nreturn true;\n}\n\npublic long getSize (Object object , Class type,\nType genericType, Annotation[] annotations, MediaType mimeType) {\nreturn -1; // size not yet known\n}\n\npublic void writeTo(Object object, Class type, Type genericType,\nAnnotation[] annotations, MediaType mimeType,\nMultivaluedMap httpHeaders, InputStream entityStream) {\nreturn xstream.toXML(object, entityStream);\n}\n}\n``` \n#### Conditional Get \nFor performance reasons we don't want to transfer resources if they have not changed which is why we can do conditional GETs two different ways with the help of headers. 
\nWhen sending a response we can add the \"Last-Modified\" header and then when sending a request for the same resource we can use the value in the \"If-Modified-Since\" Header. This can then either return with a modified value or with a 304, not modified status. \nWe can also use the \"ETag\" and \"If-None-Match\" headers which work pretty much the same. The ETag (entity tag) value is an identifier which represents a specific version of the resource. Common methods of ETag generation are using a hash of the resource's content or just hash of the last modification timestamp. \n```java\nDate lastModifiedDate = ...\n// EntityTag eTag = ...\nResponse.ResponseBuilder responseBuilder = request.evaluatePreconditions(lastModifiedDate);\nif (responseBuilder == null) {//last modified date didn't match, send new content\nreturn Response.ok(\"dummy user list\")\n.lastModified(lastModifiedDate)\n//.tag(tag)\n.build();\n} else {\n\nreturn responseBuilder.build(); //sending 304 not modified\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "Header 3": "Deployment with Jersey", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "You need to register Jersey as the servlet dispatcher for REST requests in the `web.xml` file. 
\n```xml\n\n\ncom.vogella.jersey.first\n\nJersey Web Application\norg.glassfish.jersey.servlet.ServletContainer\n\n\njavax.ws.rs.Application\nch.georgerowlands.MyApplication\n\n\n\nJersey Web Application\n/*\n\n\n``` \nAnd then add your services to the Application \n```java\npublic class MyApplication extends Application {\nprivate Set singletons = new HashSet();\nprivate Set> classes = new HashSet>();\n\npublic MyApplication () {\nclasses.add(ProductResource.class);\nsingeltons.add(new ProductResource());\n}\n\n@Override\npublic Set> getClasses () { return classes; }\n\n@Override\npublic Set getSingletons () { return singletons; }\n}\n\npublic class Server {\npublic static void main(String[] args ) throws Exception {\nfinal URI BASE_URI = new URI(\"http://localhost:9998\");\nResourceConfig rc = ResourceConfig.forApplication(new MyApplication());\n// Resource config that scans for JAX RS resources so need for application\n// ResourceConfig rc = new ResourceConfig().packages(\"ch.georgerowlands.resources\");\nJdkHttpServerFactory.createHttpServer(BASE_URI, rc);\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "RESTful APIs", "Header 2": "Jakarta", "Header 3": "Documenting with OpenAPI", "path": "../pages/digitalGarden/cs/distributedSystems/restfulApis.mdx"}, "page_content": "The OpenAPI Specification (OAS) defines a standard interface description for REST APIs. Swagger is a set of open-source tools that are built around OpenAPI. \n- Swagger Annotations: Annotations that can be added to Java implementations to generate OpenAPI specifications.\n- Swagger Editor: Editor for writing OpenAPI specifications.\n- Swagger UI: Renders OpenAPI specifications into an interactive API documentation with which REST services can be tested.\n- Swagger Codegen: Generates server and clients from an OpenAPI specification. 
\n```java\n@Singleton\n@Path(\"/products\")\n@OpenAPIDefinition(\ninfo = @Info(\ntitle =\"Products\",\ndescription =\"Service to manage products\",\nversion = \"2022.05\"\n),\nservers = @Server(url = \"http://localhost:3001\")\n)\npublic class ProductResource{\n\n@GET\n@Path(\"{id}\")\n@Operation(\nsummary = \"Get product by id\",\ndescription = \"Returns a single product\",\nresponses = {\n@ApiResponse(responseCode = \"200\",\ndescription = \"Successful operation\",\ncontent = @Content(\nschema = @Schema(implementation = Product.class)\n)),\n@ApiResponse(responseCode = \"404\",\ndescription = \"Product not found\"\n),\n}\n)\npublic Response getProduct(@PathParam(\"id\") String id) { ... }\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "General Knowledge", "Header 2": "Definitions", "path": "../pages/digitalGarden/cs/distributedSystems/generalKnowledge.mdx"}, "page_content": "A distributed system is a set of interacting active components which are located in different locations and realize a common application. Each active component has its it's own independent set of instructions which means they can also run in parallel/concurrently. The location of an active component can be physically different to its other nodes in the system but it can also just be logically for example a different process.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "General Knowledge", "Header 2": "Advantages and disadvantages", "path": "../pages/digitalGarden/cs/distributedSystems/generalKnowledge.mdx"}, "page_content": "Distributed systems can be used by multiple users at the same time that can interact with each other. \nDue to the concurrent nature of distributed systems it can also come with improvements in performance, scalability and use of idle resources. \nDepending on your design of the system you can also achieve higher reliability, stability and fault tolerance. 
For example if you have two of the same component running on different machines in different locations and one goes down you still have the other one running as a backup. Between the two components you can also split up the load instead of having one component constantly overloaded. \nDistributed systems do however come with a multitude of disadvantages. \n- If the components are located in different locations or on different machines you are depending on several physical components however you also do not have a single point of failure.\n- Designing the system can be much more complex as you might need more complex algorithms to manage consistency problems between the components and also have to take extra security precautions.\n- You also need to make sure that deployment can be orchestrated cleanly and that versioning is done correctly.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "General Knowledge", "Header 2": "Client and server model", "path": "../pages/digitalGarden/cs/distributedSystems/generalKnowledge.mdx"}, "page_content": "Clients send requests to servers and therefore actively initialize communication with the server. In most cases Clients work with multiple servers at the same time. \nServers provide some service/functionality and wait passively for requests from clients and can typically handle the requests concurrently or with queues. Server don't necessarily have to be on separate devices. One device can have multiple servers running at the same time. \n![clientServerModel](/compSci/clientServerModel.png)", "type": "Document"} -{"id": null, "metadata": {"Header 1": "General Knowledge", "Header 2": "Communication", "Header 3": "Synchronous and Asynchronous", "path": "../pages/digitalGarden/cs/distributedSystems/generalKnowledge.mdx"}, "page_content": "Synchronous communication happens when messages between sender and receiver are exchanged in real time. 
An example of synchronous communication is human communication like a phone call or video meeting. \nAsynchronous communication happens when messages can be exchanged independent of time. It doesn’t require the receiver's immediate attention, allowing them to respond to the message at their convenience. Examples of asynchronous communication are emails, online forums etc. \n![syncAndAsync](/compSci/syncAndAsync.png) \nInteresting you can emulate asynchronous communication with synchronized calls and vice versa. \n```java title=\"Emulation of async call\"\nid = service.submit(args);\n// do something else\nif(service.isReady(id)){\nres = service.getResult(id);\n}\n``` \n```java title=\"Emulation of sync call\"\nex.submit(task, handler);\nwhile(!handler.isReady(id)){\n// busy waiting\n}\nres = handler.getResult(id);\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "General Knowledge", "Header 2": "Communication", "Header 3": "Styles", "path": "../pages/digitalGarden/cs/distributedSystems/generalKnowledge.mdx"}, "page_content": "#### Remote procedure calls - RPC \nWhen using this style of communication one process calls a procedure (subroutine or service) to execute in a different address space than its own. The procedure may be on the same system or a different system connected on a network and in most cases are synchronous. System calls in the unix operating systems are an example of this. \nYou can differentiate between Procedural RPC where the server provides a set of operations and is typically stateless for example GraphQL and Object oriented RPC where the server hosts a set of object and typically has it's own state for example RMI. \n#### Massage based systems \nIn message based systems like MQTT or RabbitMQ information is exchanged through messages. These messages can either be synchronous when exchanged over TCP or UDP, or they can be asynchronous for example in the case of MQTT where you have subscriptions. 
\n![messageSystem](/compSci/messageSystem.png) \n#### Shared repository \nShared repository systems provide small interfaces which allow tuples to be created, read and deleted. REST would come under this category.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Records", "path": "../pages/digitalGarden/cs/java/records.mdx"}, "page_content": "With JDK 14 [record classes](https://docs.oracle.com/en/java/javase/18/language/records.html) were introduced, which are\na new kind of type declaration. They are especially useful for passing around immutable data containers. For example\nconsider the immutable class below. \n```java filename=\"Rectangle.java\"\npublic final class Rectangle {\nprivate final double length;\nprivate final double width;\n\npublic Rectangle(double length, double width) {\nthis.length = length;\nthis.width = width;\n}\n\n// getters\ndouble length() {\nreturn this.length;\n}\n\ndouble width() {\nreturn this.width;\n}\n\n// Implementation of equals() and hashCode(), which specify\n// that two record objects are equal if they\n// are of the same type and contain equal field values.\npublic boolean equals(Object other) {...}\n\npublic int hashCode() {...}", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Records", "path": "../pages/digitalGarden/cs/java/records.mdx"}, "page_content": "public int hashCode() {...}\n\n// An implementation of toString() that returns a string\n// representation of all the record class's fields,\n// including their names.\npublic String toString() {...}\n}\n``` \nThe following record is equivalent to the above. \n```java filename=\"Rectangle.java\"\nrecord Rectangle(double length, double width) {}\n``` \nA record consists of a name and a list of components (length and width). 
A record automatically provides the following\nfunctionalities: \n- A private final field for each of its components.\n- A public read accessor (Getter) method for each component with the same name and type of the component (without get,\nso length() not getLength()).\n- A public canonical constructor which initializes all components.\n- Implementations of the equals() and hashCode() methods, which specify that two records are equal if they are of the\nsame type and their components are equal.\n- An implementation of the toString() method that includes the string representation of all the record's components,\nwith their names. For example \"rectangle[length=12, width=10]\". \nThere are however some restrictions when working with records: \n- Records cannot extend any class (because they already extend the Record class just like enums extend the Enum class).\n- Records cannot declare instance fields (apart from the private final fields in the component list).\n- Records cannot extend other records and therefore also can't be abstract because they are implicitly final.\n- The components of a record are implicitly final, they can not be made mutable. \nThere are however some things you can still do: \n- You can declare a record inside a class however a nested record will be implicitly static.\n- You can create generic records\n- Records can implement interfaces\n- You can declare in a record's body static methods, static fields, static initializers, constructors, instance methods,\nand nested types\n- You can annotate records and a record's individual components", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Records", "Header 2": "Canonical and compact constructors", "path": "../pages/digitalGarden/cs/java/records.mdx"}, "page_content": "Records automatically generate a canonical constructor which initializes all components. You can however also define the\ncanonical constructor yourself for example if you want to add validation. 
\n```java filename=\"Rectangle.java\"\nrecord Rectangle(double length, double width) {\npublic Rectangle(double length, double width) {\nif (length <= 0 || width <= 0) {\nthrow new java.lang.IllegalArgumentException(String.format(\"Invalid dimensions: %f, %f\", length, width));\n}\nthis.length = length;\nthis.width = width;\n}\n}\n``` \nHaving to rewrite the component list as parameters for the constructors can be tiresome and also very error-prone which\nis why compact constructors which were introduced whose signature is derived from the component list. At the end the\ncompact constructor also assigns parameters to the corresponding private fields. \n```java filename=\"Rectangle.java\"\nrecord Rectangle(double length, double width) {\npublic Rectangle {\nif (length <= 0 || width <= 0) {\nthrow new java.lang.IllegalArgumentException(String.format(\"Invalid dimensions: %f, %f\", length, width));\n}\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "Dependency Injection is a design pattern used in software development and is especially used in the Spring framework.\nDependency Injection allows objects to be created with their dependencies explicitly provided, instead of the objects\nthemselves having to be responsible for the creation of their dependencies. It is a way to achieve loose coupling\nbetween objects and their dependencies. This is also referred to as Inversion of Control, IoC which is another principle\nwhere instead of as in traditional programming, a component would create and control the objects it depends on in IoC the\nresponsibility is shifted to a container or framework, in Springs case the Spring Container. 
\n![springIOC](/cs/springIoC.png) \nWhen working with Spring Dependency Injection is controlled by the Spring Container which is responsible for creating\nand managing the lifecycle of objects, and injecting their dependencies. The objects are referred to as beans. These\nbeans are then kept track of in the Spring/Application Context which can be fetched as follows: \n```java\n@SpringBootApplication\npublic class MyApplication {\npublic static void main(String[] args) {\nApplicationContext context = SpringApplication.run(MyApplication.class, args);\nfor (String beanName : context.getBeanDefinitionNames()) {\nSystem.out.println(beanName);\n}\n}\n}\n``` \nFor the Spring Container to know what is a bean you use to define them using XML but nowadays, you just annotate Java\nClasses. The parent annotation is `@Bean`, however most of the time you will be using different ones such as `@Entity`,\n`@Component`, `@Controller`, `@Service` which all inherit from it. Once the bean is defined, the container reads manages\nit and can inject it into its dependencies. Dependency injection is also commonly referred to as auto-wiring\nrelationships between beans, which is why in Spring you use the annotation `@Autowired` for a dependency to be injected. \nBeans are found and created most via a component scan. Packages are scanned for annotated classes. Scan is started in", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "relationships between beans, which is why in Spring you use the annotation `@Autowired` for a dependency to be injected. \nBeans are found and created most via a component scan. Packages are scanned for annotated classes. Scan is started in\npackage of main class and then down the tree so make sure no beans above main package! can also override basePackage to\nstart the scan. 
\n```java\n@Component(\"fooFormatter\")\npublic class FooFormatter {\npublic String format() {\nreturn \"foo\";\n}\n}\n``` \nWe can then inject the dependency in many ways such as Constructor injection, Setter injection or Field injection.\nConstructor being the best practice and the other being heavily debated: \n```java\n@Component\npublic class FooServiceField {\n@Autowired\nprivate FooFormatter fooFormatter;\n}\n@Component\npublic class FooServiceSetter {\nprivate FooFormatter fooFormatter;\n@Autowired\npublic void setFormatter(FooFormatter fooFormatter) {\nthis.fooFormatter = fooFormatter;\n}\n}\n@Component\npublic class FooServiceConstructor {\nprivate FooFormatter fooFormatter;\n@Autowired // optional\npublic FooServiceConstructor(FooFormatter fooFormatter) {\nthis.fooFormatter = fooFormatter;\n}\n}\n```", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "Header 2": "Qualifiers and Primary Beans", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "use qualifiers so that when interface is to be injected we can choose implementation. Or mark one as primary, i.e the\ndefault one to inject. Show example of in constructor", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "Header 2": "Spring Profiles", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "Can setup different profiles for example in dev you want h2 database and matching services etc. and then for in prod use\npostgres or mysql. Or could also use for different languages, why would an api be language dependent? 2 beans with same\nname but then set different profiles and in application.properties set the active profile. 
there is a default profile and\nyou can have multiple profiles.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "Header 2": "Spring Bean Lifecycle", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "all beans are made ready before application is considered ready for use. Can hook onto certain events to do certain thigns.\nRarely need to do anything with this stuff. PreDestroy annotated method much more lightly to use.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "Header 2": "Spring Stereotype", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "@Component, Controller etc. A certain set of characteristics expected with a bean. Not always functional could also just\nbe for readability/documentation. thia ahould prob be mentioned somewhere above.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Dependency Injection and Beans", "Header 2": "Bean Scopes", "path": "../pages/digitalGarden/cs/java/spring/dependencyInjectionBeans.mdx"}, "page_content": "default is singelton, one isntance in the spring container. Other possibilites are prototype where new instance for each\nrequest and then more neach scopes such as request, session , global session, application??? and websocket. Could also\nmake custom scope why idk. Set using scope annotation, almsot always singelton is fine.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "From Java to Kotlin", "Header 2": "Kotlin and the JVM", "path": "../pages/digitalGarden/cs/kotlin/fromJava.mdx"}, "page_content": "Kotlin was created by JetBrains in 2011 as an improved alternative to Java. Kotlin is more modern, concise, and safer\nthan Java. Kotlin has gained a lot of popularity as it just like Java but better in many ways, Google even declared\nas the primary language for Android development. 
\nKotlin was designed to be fully interoperable with Java. This means that you can use Java classes, methods and libraries\nin Kotlin and vice versa. This is possible because Kotlin code can be compiler into Java bytecode. The Kotlin compiler\ngenerates `.class` files just like the Java compiler that can then be executed on the Java Virtual Machine (JVM).", "type": "Document"} -{"id": null, "metadata": {"Header 1": "From Java to Kotlin", "Header 2": "Entry Point", "path": "../pages/digitalGarden/cs/kotlin/fromJava.mdx"}, "page_content": "Just like Java, Kotlin uses a main function as an entry point. However in Java it always needs to be able to handle the program arguments by using either of these possibilities: \n```java\npublic static void main(String args[]) {\nSystem.out.println(\"Hello World!\");\n}\n``` \nor \n```java\npublic static void main(String... args) {\nSystem.out.println(\"Hello World!\");\n}\n``` \nIn Kotlin it is like in C you can either define the program arguements or not, if you don't define them they will just be ignored. \n```kotlin\nfun main() {\nprintln(\"Hello world!\")\n}\n``` \nor \n```kotlin\nfun main(args: Array) {\nprintln(\"Hello world!\")\n}\n``` \nWhere Array is a wrapper with a lot of additional functioanlity for a normal Java array.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "From Java to Kotlin", "Header 2": "Semicolons", "path": "../pages/digitalGarden/cs/kotlin/fromJava.mdx"}, "page_content": "Unlike in Java where semicolons are mandatory to end a statement, in Kotlin semicolons are optional. However, there are\nsome situations where you may need to use semicolons in Kotlin to separate multiple statements on a single line. 
\n\n\n```kotlin filename=\"Kotlin\"\nval a = 0\nval x = 5; val y = 10; println(x + y)\n```\n\n\n```java filename=\"Java\"\nfinal var a = 0;\nfinal var x = 5; final var y = 10; System.out.println(x + y);\n```\n\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "From Java to Kotlin", "Header 2": "Standard Output", "path": "../pages/digitalGarden/cs/kotlin/fromJava.mdx"}, "page_content": "In Java the `java.lang` package is implicitly imported which contains amongst other things the classes for String, Integer\nand important in this section the System class which is used to output things to the standard output via\n`System.out.println()`. In Kotlin `java.lang` is implicitly imported but so is the `kotlin` package and more importantly\nfor this section the `kotlin.io` package (plus a few other packages). The `kotlin.io` package contains, as with a lot of\nthings in Kotlin the wrapper function `println()` which internally just calls `System.out.println()`. This then allows you\nto not always have to write such verbose code (you can still however of course use `System.out.println()` in kotlin, but don't). \n\n\n```kotlin filename=\"Kotlin\"\nprintln(\"hello world!\")\n```\n\n\n```java filename=\"Java\"\nSystem.out.println(\"hello world!\");\n```\n\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "From Java to Kotlin", "Header 2": "`val` and `var`", "path": "../pages/digitalGarden/cs/kotlin/fromJava.mdx"}, "page_content": "Type inference was introduced in Java 10 with the introduction of the `var` keyword. In Kotlin, you use `var` to define a\nmutable variable. By default, the type can be inferred, but it can also be explicitly defined after `:`. In Kotlin, you can\nalso use the `val` keyword which the same as writing `final var` in Java, i.e. the variable is then immutable (read-only). \n\nThe final keyword exists in Kotlin, however can not be used in conjunction with variables, only to stop inheritance\nof\na class method. 
I know very confusing.\n \n\n\n```kotlin filename=\"Kotlin\"\nvar a = 1\nvar b // doesn't work, how much memory???\nvar c : Int;\nc = 4\nvar d : Int = 5\nval e = 10\ne = 15 // will fail\n```\n\n\n```java filename=\"Java\"\nvar a = 1; // java also has type inference\nvar b; // also doesn't work\nint c;\nc = 4;\nint d = 5;\nfinal int e = 10;\ne = 15; // will also fail\n```\n\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Ridge Regression", "Header 2": "Bias and Variance", "path": "../pages/digitalGarden/cs/machineLearning/ridge.mdx"}, "page_content": "What exactly is overfitting? 2 factors play a role in overfitting. Bias and Variance. Bias is the difference between\nthe average prediction of our model and the correct value which we are trying to predict. Bias can determine if we\ngot the relationship correct, so we fit the curve correctly. Variance is the variability of model prediction for a\ngiven data point. If a model has no variance then it will fit badly to unseen data. \nThere are a few ways to combat overfitting. One way is to use cross validation which will allow the model to\ngeneralize better to unseen data. Another way is to use regularization. Regularization is a technique that penalizes\ncomplexity. There is then also boosting and bagging. Bagging is used in random forests.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Backpropagation", "Header 2": "Forward Pass", "path": "../pages/digitalGarden/cs/machineLearning/backprop.mdx"}, "page_content": "The forward pass, sometimes also called forward propogration, is the process of calculating the output of a neural network given an input. This is done by running the input through the network layer by layer,\nand applying the activation function to the output of each layer. The output of the last layer is the output of the network. \nSay we have a simple neural network with three linear layers, the input layer of size 2, a hidden layer of size 2 and an output layer of size 1. 
We will use the\nsigmoid activation function for the hidden layer and the output layer. \n \nEach linear layer is defined by a weight matrix $\\boldsymbol{W}$ and a bias vector\n$\\boldsymbol{b}$. The vector $\\boldsymbol{z}$ contains the pre-activations and the vector $\\boldsymbol{a}$ the activatiosn, i.e. outputs of the layer. \n$$\n\\begin{align*}\n\\boldsymbol{z} &= \\boldsymbol{W}\\boldsymbol{x} + \\boldsymbol{b} \\\\\n\\boldsymbol{a} &= \\sigma(\\boldsymbol{z})\n\\end{align*}\n$$ \nLet's say we have the following input, weights and biases: \n| Variable | Value |\n| -------- | ----- |\n| $x1$ | 0.888 |\n| $x2$ | -0.49 |\n| $w1$ | 1.76 |\n| $w2$ | 0.4 |\n| $w3$ | 0.97 |\n| $w4$ | 2.24 |\n| $w5$ | 1.86 |\n| $w6$ | -0.97 |\n| $b1$ | 0 |\n| $b2$ | 0 |\n| $b3$ | 0 | \nThen we can calculate the output of the network as follows. First we calculate the output of the hidden layer. \nEvaluation Trace or Wengert list??? https://pub.towardsai.net/a-gentle-introduction-to-automatic-differentiation-74e7eb9a75af \n$$\n\\begin{align*}\na1 &= x1w1 + x2w2 + b1 \\\\\n&= 0.888 * 1.76 + -0.49 * 0.4 + 0 \\\\\n&= 1.367\n\\\\\nh1 &= \\sigma(a1) \\\\\n&= \\frac{1}{1 + e^{-a1}} \\\\\n&= \\frac{1}{1 + e^{-1.367}} \\\\\n&= 0.797\n\\end{align*}\n$$ \nIf we do the same for the other neuron we get $a2 = 0.888 * 0.97 + -0.49 * 2.24 + 0 = -0.236$ and $h2 = \\sigma(-0.236) = 0.441$. Now we have our hidden layer outputs,\nwe can calculate the output of the network. \n$$\n\\begin{align*}", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Backpropagation", "Header 2": "Forward Pass", "path": "../pages/digitalGarden/cs/machineLearning/backprop.mdx"}, "page_content": "\\end{align*}\n$$ \nIf we do the same for the other neuron we get $a2 = 0.888 * 0.97 + -0.49 * 2.24 + 0 = -0.236$ and $h2 = \\sigma(-0.236) = 0.441$. Now we have our hidden layer outputs,\nwe can calculate the output of the network. 
\n$$\n\\begin{align*}\na3 &= h1w5 + h2w6 + b3 \\\\\n&= 0.797 * 1.86 + 0.441 * -0.97 + 0 \\\\\n&= 1.055\n\\\\\ny &= \\sigma(a3) \\\\\n&= \\frac{1}{1 + e^{-a3}} \\\\\n&= \\frac{1}{1 + e^{-1.055}} \\\\\n&= 0.741\n\\end{align*}\n$$ \n \nWe can also write these calculations in matrix form which is more efficient and easier to generalise to larger networks. \n\nDo matrix form\n", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Backpropagation", "Header 2": "Backpropagation", "path": "../pages/digitalGarden/cs/machineLearning/backprop.mdx"}, "page_content": "The backpropagation algorithm is the key to training neural networks. It is the process of calculating the gradients of the loss function with respect to the\nweights and biases of the network. These gradients are then used to update the weights and biases using gradient descent to minimise the loss function. \nThe backpropagation algorithm is based on the chain rule from calculus. So lets start with a brief reminder of the chain rule. \n\nIf we have a the differentiable functions $f(x)$ and $g(x)$ and the composite function $h(x) = f(g(x))$, i.e where\nthe function $f$ is applied to the output of $g$, then the derivative of $h$ with respect to $x$ is given by: \n$$\nh'(x) = f'(g(x))g'(x) \\text{ or } \\frac{dh}{dx} = \\frac{df}{dg}\\frac{dg}{dx}\n$$ \nNotice that the denominator $dg$ is the the same as the following numerator, this can be thougth of as \"the chain\". The chain rule also makes sense\nintuitively, if we think of $dg$ cancelling out in the numerator and denominator. \nIt is a simple but powerful rule that allows us to calculate the derivative of a composite function. For example, if we have $h(x) = (x^2 + 1)^3$,\nthen we can write $h(x) = f(g(x))$ where $f(x) = x^3$ and $g(x) = x^2 + 1$. 
The derivative of $h$ is then given by: \n$$\n\\begin{align*}\nh'(x) &= f'(g(x))g'(x) \\\\\n&= 3(x^2 + 1)^2 * 2x \\\\\n&= 6x(x^2 + 1)^2\n\\end{align*}\n$$ \nThis also works for more obvious composite functions such as $h(x) = \\sin(x^2 + 1)$. \nThe key take away is that the derivative of a composite function can be calculated step by step, by first calculating the derivative of the most\ninner function, then the next inner function and so on. This is the key idea behind backpropagation as a neural network is just one big composite function with\nlots of variables and lots of inner functions. \nTODO: Multiple variables\n \nshow the idea. The chain rule. then the full derivation. \nShow nice reformulation of the loss function to make the calculation easier.", "type": "Document"} -{"id": null, "metadata": {"Header 1": "Backpropagation", "Header 2": "Backpropagation", "Header 3": "Vanishing and Exploding Gradients", "path": "../pages/digitalGarden/cs/machineLearning/backprop.mdx"}, "page_content": "The vanishing and exploding gradient problem is a problem that occurs when training very deep neural networks. It is caused by the chain rule and the fact that the\ngradient of the loss function is calculated by multiplying the gradients of each layer together. If the gradients are small, then multiplying them together will\nmake them even smaller. This is the vanishing gradient problem. If the gradients are large, then multiplying them together will make them even larger. This is the\nexploding gradient problem. \nThere are many possible solutions to this problem. Some of the most common are: \n- Different activation function such as ReLU or Leaky ReLU.\n- Batch Normalisation.\n- Residual connections, also known as skip connections. \nWe can see the vanishing gradient problem pretty easily by looking at the derivative of the sigmoid function. \n
\n