Day 85 - Data Services

2025-07-06 00:10:32 +07:00 · 2022-03-30 10:31:55 +01:00
parent a8ae8e2d64
commit e42a6577ae
15 changed files with 177 additions and 10 deletions
--- a/Days/Images/Day84_Monitoring1.png
+++ b/Days/Images/Day84_Monitoring1.png
--- a/Days/Images/Day85_Data1.png
+++ b/Days/Images/Day85_Data1.png
--- a/Days/Images/Day85_Data2.png
+++ b/Days/Images/Day85_Data2.png
--- a/Days/Images/Day85_Data3.png
+++ b/Days/Images/Day85_Data3.png
--- a/Days/Images/Day85_Data4.png
+++ b/Days/Images/Day85_Data4.png
--- a/Days/day79.md
+++ b/Days/day79.md
@ -12,14 +12,19 @@ This is the essence of a good log aggregation platform efficiently collect logs
 ### Example App 
-Our example application is a web browser, we have a typical front end and backend storing our critical data to a MongoDB database. 
+Our example application is a web app, we have a typical front end and backend storing our critical data to a MongoDB database. 
 If a user told us the page turned all white and printed an error message, we would be hard-pressed to diagnose the problem with our current stack. The user would need to manually send us the error, and we'd need to match it with relevant logs in the other three services. 
 ### ELK 
-Let's take a look at elk a popular open source log aggregation stack named after its three components elasticsearch logstash and kibana if we installed it in our example app we'd get three new services so the user's web browser again would connect to our front end and back end the back end would connect to and all of these services the browser the front end the back end and would all send logs to logstash and then the way that these three components work the components of elk elasticsearch logstash and Kibana is that all of the other services send logs to logstash, logstash takes these logs which are text emitted by the application for example the the web browser when you visit a web page, the web page might log this visitor access this page at this time and that's an example of a log message those logs would be sent to logstash.
+Let's take a look at ELK, a popular open source log aggregation stack named after its three components: Elasticsearch, Logstash and Kibana. Imagine we installed it in the same environment as our example app. 
 The web application would connect to the frontend, which then connects to the backend, and the backend would send its logs to Logstash. 
 ### The components of ELK 
 The way the three components (Elasticsearch, Logstash and Kibana) work together is that all of the services send logs to Logstash. Logstash takes these logs, which are text emitted by the application. For example, when you visit a web page, the web application might log that this visitor accessed this page at this time; that log message would be sent to Logstash.
 Logstash would then extract fields from them. For a log message such as user did **thing** at **time**, it would extract the time, the message and the user, and include them all as tags, so the message becomes an object of tags and message that you can search easily, for example to find all of the requests made by a specific user. Logstash doesn't store things itself; it stores them in Elasticsearch, which is an efficient database for querying text. Elasticsearch exposes the results through Kibana, a web server that connects to Elasticsearch and allows administrators (the DevOps person or other people on your team, such as the on-call engineer) to view the logs in production whenever there's a major fault. You as the administrator would connect to Kibana, and Kibana would query Elasticsearch for logs matching whatever you wanted. 
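As a rough illustration of the extraction step described above, the sketch below parses one log line into a structured record. The log format, the regular expression and the field names (`time`, `user`, `message`) are assumptions made for this example; this is plain Python, not Logstash configuration.

```python
import re

# Hypothetical log format for illustration: "<ISO time> user=<name> <message>"
LOG_PATTERN = re.compile(r"^(?P<time>\S+) user=(?P<user>\S+) (?P<message>.+)$")


def parse_log_line(line):
    """Turn a raw log line into a dict of searchable fields, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    return match.groupdict()


record = parse_log_line("2022-03-30T10:31:55Z user=alice visited /checkout")
print(record)
```

Once every log line is a record like this, finding all of the requests made by one user is a simple filter on the `user` field rather than a text search across services.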
--- a/Days/day83.md
+++ b/Days/day83.md
@ -2,7 +2,7 @@
 We saw a lot of Kibana over this section around Observability. But we have to also take some time to cover Grafana. But also they are not the same and they are not completely competing against each other. 
-Kibana’s core feature is data querying and analysis. Using various methods, users can search the data indexed in Elasticsearch for specific events or strings within their data for root cause analysis and diagnostics. Based on these queries, users can use Kibana’s visualidation features which allow users to visualize data in a variety of different ways, using charts, tables, geographical maps and other types of visualizations.
+Kibana’s core feature is data querying and analysis. Using various methods, users can search the data indexed in Elasticsearch for specific events or strings within their data for root cause analysis and diagnostics. Based on these queries, users can use Kibana’s visualisation features which allow users to visualize data in a variety of different ways, using charts, tables, geographical maps and other types of visualizations.
 Grafana actually started as a fork of Kibana. Its aim was to supply support for metrics (aka monitoring), which at that time Kibana did not provide. 
--- a/Days/day84.md
+++ b/Days/day84.md
@ -1,6 +1,6 @@
 ## The Big Picture: Data Management
-![](Images/Day84_Monitoring1.png)
+![](Images/Day84_Data1.png)
 Data Management is by no means a new wall to climb, although we do know that data is more important than it maybe was a few years ago. Valuable and ever changing, it can also be a massive nightmare when we are talking about automation and continuously integrating, testing and deploying frequent software releases. Enter persistent data and the underlying data services, often the main culprit when things go wrong. 
--- a/Days/day85.md
+++ b/Days/day85.md
@ -0,0 +1,140 @@
 ## Data Services
 Databases are going to be the most common data service that we come across in our environments. I wanted to take this session to explore some of the different types of databases and the use cases each of them has. Some of them we have already used and seen throughout the course of the challenge. 
 From an application development point of view, choosing the right data service or database is going to be a huge decision when it comes to the performance and scalability of your application. 
 https://www.youtube.com/watch?v=W2Z7fbCLSTw
 ### Key-value
 A key-value database is a type of nonrelational database that uses a simple key-value method to store data. A key-value database stores data as a collection of key-value pairs in which a key serves as a unique identifier. Both keys and values can be anything, ranging from simple objects to complex compound objects. Key-value databases are highly partitionable and allow horizontal scaling at scales that other types of databases cannot achieve.
 An example of a Key-Value database is Redis. 
 *Redis is an in-memory data structure store, used as a distributed, in-memory key–value database, cache and message broker, with optional durability. Redis supports different kinds of abstract data structures, such as strings, lists, maps, sets, sorted sets, HyperLogLogs, bitmaps, streams, and spatial indices.*
 ![](Images/Day85_Data1.png)
 As the description of Redis suggests, keeping data in memory means our database is fast, but we are limited on space as a trade-off. There are also no queries or joins, which means data modelling options are very limited. 
 Best for: 
 - Caching 
 - Pub/Sub
 - Leaderboards 
 - Shopping carts
 Generally used as a cache above another persistent data layer. 
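The caching pattern mentioned above can be sketched in a few lines. This is a minimal in-memory stand-in written in plain Python, not the Redis client API; `TTLCache`, `slow_lookup`, `get_user` and the 60-second expiry are hypothetical names and values chosen for illustration.

```python
import time


class TTLCache:
    """A tiny in-memory key-value store where entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._data = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave as a cache miss
            return None
        return value


calls = []  # records every trip to the (simulated) persistent layer


def slow_lookup(user_id):
    """Stand-in for a query against the persistent data layer."""
    calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id):
    """Cache-aside read: try the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = slow_lookup(user_id)
    cache.set(key, value)
    return value


cache = TTLCache(ttl_seconds=60)
first = get_user(cache, 42)   # miss: goes to the persistent layer
second = get_user(cache, 42)  # hit: served from memory
```

The second read never reaches `slow_lookup`, which is the whole point of placing a key-value store in front of a slower persistent database.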
 ### Wide Column
 A wide-column database is a NoSQL database that organises data storage into flexible columns that can be spread across multiple servers or database nodes, using multi-dimensional mapping to reference data by column, row, and timestamp.
 *Cassandra is a free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.*
 ![](Images/Day85_Data2.png)
 Columns are flexible rather than fixed by a rigid schema, which means it can handle unstructured data, and this can be seen as a benefit to some workloads. 
 Best for: 
 - Time-Series 
 - Historical Records 
 - High-Write, Low-Read 
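To make the row, column and timestamp mapping more concrete, here is a toy model in plain Python. It only sketches the data layout and is not how Cassandra is implemented or queried; `WideColumnTable` and the sensor readings are hypothetical.

```python
from collections import defaultdict


class WideColumnTable:
    """Toy wide-column layout: row key -> column name -> list of (timestamp, value)."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(list))

    def write(self, row_key, column, value, timestamp):
        # A write only appends a new version, which suits write-heavy workloads.
        self._rows[row_key][column].append((timestamp, value))

    def read_latest(self, row_key, column):
        """Return the value with the highest timestamp, or None if absent."""
        versions = self._rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda version: version[0])[1]

    def columns(self, row_key):
        """Each row can carry its own set of columns."""
        return sorted(self._rows.get(row_key, {}))


table = WideColumnTable()
table.write("sensor-1", "temperature", 20.5, timestamp=1)
table.write("sensor-1", "temperature", 21.0, timestamp=2)
table.write("sensor-1", "humidity", 40, timestamp=2)
table.write("sensor-2", "pressure", 1013, timestamp=1)

latest = table.read_latest("sensor-1", "temperature")
print(latest, table.columns("sensor-1"), table.columns("sensor-2"))
```

Note that `sensor-1` and `sensor-2` hold different columns, and that older values stay available by timestamp, which is why this shape fits time-series and historical records.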
 ### Document
 A document database (also known as a document-oriented database or a document store) is a database that stores information in documents. 
 *MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Server Side Public License.*
 ![](Images/Day85_Data3.png)
 NoSQL document databases allow businesses to store simple data without writing complex SQL. Data can be stored quickly with no compromise to reliability. 
 Best for: 
 - Most Applications 
 - Games 
 - Internet of Things 
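The idea of JSON-like documents with optional schemas can be sketched as follows. This is a toy collection in plain Python and not the MongoDB driver API; the `Collection` class, its `find` method and the player documents are hypothetical.

```python
import json


class Collection:
    """Toy document collection: stores JSON-like dicts and filters them by field."""

    def __init__(self):
        self._documents = []

    def insert(self, document):
        # Round-trip through JSON to check the document is JSON-serialisable
        # and to store a copy rather than the caller's object.
        self._documents.append(json.loads(json.dumps(document)))

    def find(self, **criteria):
        """Return documents whose top-level fields equal all given criteria."""
        return [
            doc for doc in self._documents
            if all(doc.get(field) == expected for field, expected in criteria.items())
        ]


players = Collection()
players.insert({"name": "ada", "level": 3, "inventory": ["sword", "map"]})
players.insert({"name": "lin", "level": 3})  # no inventory field
players.insert({"name": "sam", "level": 7, "guild": "devops"})

level_three = [doc["name"] for doc in players.find(level=3)]
print(level_three)
```

The three documents share a collection even though their fields differ, which is what makes this model convenient for applications whose data shape changes often.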
 ### Relational
 If you are new to databases but you know of them, my guess is that you have come across a relational database. 
 A relational database is a digital database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system. Many relational database systems have an option of using SQL for querying and maintaining the database.
 *MySQL is an open-source relational database management system. Its name is a combination of "My", the name of co-founder Michael Widenius's daughter, and "SQL", the abbreviation for Structured Query Language.*
 MySQL is one example of a relational database; there are lots of other options. 
 ![](Images/Day85_Data4.png)
 Whilst researching relational databases, the abbreviation **ACID** came up a lot. ACID (atomicity, consistency, isolation, durability) is a set of properties of database transactions intended to guarantee data validity despite errors, power failures, and other mishaps. In the context of databases, a sequence of database operations that satisfies the ACID properties (which can be perceived as a single logical operation on the data) is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction. 
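The bank transfer example can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch using SQLite rather than MySQL; the `accounts` table, the `transfer` helper and the balances are assumptions made for illustration. The credit is deliberately applied before the debit so that a failed debit shows the earlier change being rolled back.

```python
import sqlite3


def transfer(conn, sender, receiver, amount):
    """Move funds between accounts atomically: both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            # If this debit violates the CHECK constraint, the credit above is undone.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, neither balance has changed: the credit to `bob` was discarded together with the rejected debit, which is the atomicity part of ACID.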
 Best for: 
 - Most Applications (it has been around for years, which doesn't mean it is the best)
 It is not ideal for unstructured data, and scaling is where some of the NoSQL options mentioned above do better for certain workloads. 
 ### Graph
 A graph database stores nodes and relationships instead of tables, or documents. Data is stored just like you might sketch ideas on a whiteboard. Your data is stored without restricting it to a pre-defined model, allowing a very flexible way of thinking about and using it.
 *Neo4j is a graph database management system developed by Neo4j, Inc. Described by its developers as an ACID-compliant transactional database with native graph storage and processing.*
 Best for: 
 - Graphs
 - Knowledge Graphs
 - Recommendation Engines
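A recommendation query over nodes and relationships can be sketched with an adjacency map. This is plain Python rather than Neo4j or its Cypher query language; the `Graph` class, the `recommend` method and the names in the example are hypothetical.

```python
from collections import Counter, defaultdict


class Graph:
    """Toy undirected graph stored as an adjacency map of node -> neighbours."""

    def __init__(self):
        self._edges = defaultdict(set)

    def relate(self, a, b):
        self._edges[a].add(b)
        self._edges[b].add(a)

    def recommend(self, node):
        """Suggest nodes two hops away, ranked by number of shared neighbours."""
        direct = self._edges.get(node, set())
        counts = Counter()
        for neighbour in direct:
            for candidate in self._edges[neighbour]:
                if candidate != node and candidate not in direct:
                    counts[candidate] += 1
        # Sort by shared-neighbour count (descending), then by name for stable output.
        return sorted(counts, key=lambda name: (-counts[name], name))


g = Graph()
g.relate("ann", "bob")
g.relate("ann", "cat")
g.relate("bob", "dan")
g.relate("cat", "dan")
g.relate("cat", "eve")

suggestions = g.recommend("ann")
print(suggestions)
```

Here `dan` ranks first because he is connected to two of `ann`'s neighbours, while `eve` shares only one. Following relationships like this is the kind of traversal a graph database is built to do directly, without join tables.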
 ### Search Engine
 In the last section we actually used a search engine database in the form of Elasticsearch. 
 A search-engine database is a type of non-relational database that is dedicated to the search of data content. Search-engine databases use indexes to categorise the similar characteristics among data and facilitate search capability.
 *Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.*
 Best for: 
 - Search Engines 
 - Typeahead 
 - Log search
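The index-based search described above usually relies on an inverted index, which maps each word to the documents that contain it. The sketch below is a toy version in plain Python, not the Elasticsearch or Lucene API; `InvertedIndex` and the sample log lines are hypothetical.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy full-text index: maps each lowercase word to the ids of documents containing it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, doc_id, text):
        for word in text.lower().split():
            self._postings[word].add(doc_id)

    def search(self, query):
        """Return sorted ids of documents that contain every word in the query."""
        words = query.lower().split()
        if not words:
            return []
        matches = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            matches &= self._postings.get(word, set())
        return sorted(matches)

    def typeahead(self, prefix):
        """Return indexed words starting with the prefix, as a typeahead box might."""
        prefix = prefix.lower()
        return sorted(word for word in self._postings if word.startswith(prefix))


index = InvertedIndex()
index.add(1, "Connection timeout in backend")
index.add(2, "Backend started")
index.add(3, "Frontend connection refused")

hits = index.search("backend connection")
completions = index.typeahead("con")
print(hits, completions)
```

Because the lookup goes from word to documents, a search only touches the entries for the words in the query instead of scanning every log line.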
 ### Multi-model
 A multi-model database is a database management system designed to support multiple data models against a single, integrated backend. In contrast, most database management systems are organized around a single data model that determines how data can be organized, stored, and manipulated. Document, graph, relational, and key–value models are examples of data models that may be supported by a multi-model database. 
 *Fauna is a flexible, developer-friendly, transactional database delivered as a secure and scalable cloud API with native GraphQL.*
 Best for: 
 - You are not stuck to having to choose a data model
 - ACID Compliant
 - Fast 
 - No provisioning overhead
 - How do you want to consume your data and let the cloud do the heavy lifting
 That is going to wrap up this database overview session. No matter what industry you are in, you are going to come across at least one of these areas of databases. We are then going to take some of these examples and look at data management, and in particular the protection and storing of these data services, later on in the section. 
 There are a ton of resources linked below. You could honestly spend 90 years deep diving into all of the database types and everything that comes with them. 
 ## Resources 
 - [Redis Crash Course - the What, Why and How to use Redis as your primary database](https://www.youtube.com/watch?v=OqCK95AS-YE)
 - [Redis: How to setup a cluster - for beginners](https://www.youtube.com/watch?v=GEg7s3i6Jak)
 - [Redis on Kubernetes for beginners](https://www.youtube.com/watch?v=JmCn7k0PlV4)
 - [Intro to Cassandra - Cassandra Fundamentals](https://www.youtube.com/watch?v=YjYWsN1vek8)
 - [MongoDB Crash Course](https://www.youtube.com/watch?v=ofme2o29ngU)
 - [MongoDB in 100 Seconds](https://www.youtube.com/watch?v=-bt_y4Loofg)
 - [What is a Relational Database?](https://www.youtube.com/watch?v=OqjJjpjDRLc)
 - [Learn PostgreSQL Tutorial - Full Course for Beginners](https://www.youtube.com/watch?v=qw--VYLpxG4)
 - [MySQL Tutorial for Beginners [Full Course]](https://www.youtube.com/watch?v=7S_tz1z_5bA)
 - [What is a graph database? (in 10 minutes)](https://www.youtube.com/watch?v=REVkXVxvMQE)
 - [What is Elasticsearch?](https://www.youtube.com/watch?v=ZP0NmfyfsoM)
 - [FaunaDB Basics - The Database of your Dreams](https://www.youtube.com/watch?v=2CipVwISumA)
 - [Fauna Crash Course - Covering the Basics](https://www.youtube.com/watch?v=ihaB7CqJju0)
 See you on [Day 86](day86.md)
--- a/Days/day86.md
+++ b/Days/day86.md
@ -0,0 +1,4 @@
 ## Backup all the platforms
 Talk about Physical, Virtual, Cloud, Kubernetes as different platforms. 
--- a/Days/day87.md
+++ b/Days/day87.md
@ -0,0 +1,5 @@
 ## Hands-On Backup & Recovery
 Kopia walkthrough and overview on desktop machine to Object Storage
 Kasten K10 deployment and protect one of those data services 
--- a/Days/day88.md
+++ b/Days/day88.md
@ -0,0 +1,3 @@
 ## Application Focused Backups
 This should focus on Kanister but maybe K10 as well towards the end. 
--- a/Days/day89.md
+++ b/Days/day89.md
@ -0,0 +1,5 @@
 ## Disaster Recovery
 Fire, Flood & Blood but now also ransomware! 
 K10 could be the focus here. 
--- a/Days/day90.md
+++ b/Days/day90.md
@ -0,0 +1,5 @@
 ## Data & Application Mobility
 Maybe we have been working on the same cluster since the start of this section and we have deployed several data services and applications to our cluster but now we have the requirement to move to a new cluster. 
 A new Minikube cluster, this could create a new cluster and then use Kasten K10 to move it over to the new
--- a/README.md
+++ b/README.md
@ -133,9 +133,9 @@ This will not cover all things DevOps but it will cover the areas that I feel wi
 ### Store & Protect Your Data
 - [✔️] 🗃️ 84 > [The Big Picture: Data Management](Days/day84.md)
- [🚧] 🗃️ 85 > [](Days/day85.md)
+- [🚧] 🗃️ 85 > [Data Services](Days/day85.md)
- [] 🗃️ 86 > [](Days/day86.md)
+- [] 🗃️ 86 > [Backup all the platforms](Days/day86.md)
- [] 🗃️ 87 > [](Days/day87.md)
+- [] 🗃️ 87 > [Hands-On Backup & Recovery](Days/day87.md)
- [] 🗃️ 88 > [](Days/day88.md)
+- [] 🗃️ 88 > [Application Focused Backups](Days/day88.md)
- [] 🗃️ 89 > [](Days/day89.md)
+- [] 🗃️ 89 > [Disaster Recovery](Days/day89.md)
- [] 🗃️ 90 > [](Days/day90.md)
+- [] 🗃️ 90 > [Data & Application Mobility](Days/day90.md)