# UB-EDGE Stack

A server is available in the playground. Send me ([arthur.koch](mailto:arthur.koch@uniberg.com)) your public SSH key to get access.

**10.31.1.36**  
**ub-edge-mgmt**  
4 Cores | 32 GB Memory | 50 GB Storage

```
ssh your_username@10.31.1.36
# optional: add the hostname to the hosts file on your MacBook
echo '10.31.1.36 ub-edge-mgmt' | sudo tee -a /etc/hosts > /dev/null
# optional: add an alias to your shell config (yolo mode); zsh is assumed, use ~/.bashrc for bash
echo 'alias ub-edge-mgmt="ssh your_username@10.31.1.36"' >> ~/.zshrc; source ~/.zshrc
```

It should be used to try things out, as a host for CI/CD runners, for building and serving containers, for interacting with the edge server, or even as a data source. Please use your own user whenever possible.

# [](#disclaimer)disclaimer

this is just my point of view and you are free to comment, add, edit or delete anything you want.

more input = more inspiration = better ideas

# [](#required-for-the-ub-edge-hardware)required for the UB-EDGE hardware

- fast, reliable, customizable, modular and persistent "restart and it works again" operating system
- internet/intranet access out of the box **(most important feature imho)**
- network management for wifi/5g/ethernet/whatever
- Uplink/DHCP/DNS/hardware plug & play **not sure if dedicated hardware instead of the OS should manage some parts**
- a way of accessing the shell or graphical output without network functionality and without cables

# [](#very-nice-to-have)very nice to have

- docker + maybe GUI for management and on-site deployment -> we should containerize everything
- basic monitoring, log and data collection, alerting
- think about some useful connectors or use cases that can be accessed via browser from any device connected to the UB-EDGE network

# [](#%EF%B8%8F-some-inspiration-%EF%B8%8F)🧚‍♀️ some inspiration 🧚🏻‍♂️

## [](#bare-metal-deployment)Bare Metal Deployment

Some tools to provision bare metal. All I know is that nobody likes Foreman.

Hardware provisioning still relies on old technology such as PXE, TFTP and IPMI.

### [](#ironic)Ironic

[![ironic](https://docs.openstack.org/ironic/2023.1/_images/deployment_architecture_2.png)](https://docs.openstack.org/ironic/2023.1/_images/deployment_architecture_2.png)

> Ironic is an OpenStack project which provisions bare metal (as opposed to virtual) machines. It may be used independently or as part of an OpenStack Cloud, and integrates with the OpenStack Identity (keystone), Compute (nova), Network (neutron), Image (glance), and Object (swift) services.  
> The Bare Metal service manages hardware through both common (eg. PXE and IPMI) and vendor-specific remote management protocols. It provides the cloud operator with a unified interface to a heterogeneous fleet of servers while also providing the Compute service with an interface that allows physical servers to be managed as though they were virtual machines.

[https://docs.openstack.org/ironic/latest/](https://docs.openstack.org/ironic/latest/)

- sounds very promising
- standalone with very little overhead possible
- can be run as a standalone docker container
- part of the OpenStack ecosystem, which is good for integration with other applications
- canonical

In container: [https://docs.openstack.org/bifrost/latest/](https://docs.openstack.org/bifrost/latest/)

[https://docs.openstack.org/kolla-ansible/latest/reference/bare-metal/ironic-guide.html](https://docs.openstack.org/kolla-ansible/latest/reference/bare-metal/ironic-guide.html)

### [](#maas)MaaS

[![MaaS](/ub-edge/ub-edge-stack/-/raw/main/img/maas.png)](/ub-edge/ub-edge-stack/-/raw/main/img/maas.png)

> MAAS has a tiered architecture with a central postgres database backing a ‘Region Controller (regiond)’ that deals with operator requests. Distributed Rack Controllers (rackd) provide high-bandwidth services to multiple racks. The controller itself is stateless and horizontally scalable, presenting only a REST API.  
> Rack Controller (rackd) provides DHCP, IPMI, PXE, TFTP and other local services. They cache large items like operating system install images at the rack level for performance but maintain no exclusive state other than credentials to talk to the controller.

[https://maas.io/how-it-works](https://maas.io/how-it-works)

- sounds good
- not too many features, which is good
- is also mentioned in the OpenStack ecosystem
- maybe too much overhead
- maintained by Canonical

### [](#self-built-network-boot)self built network boot

- maybe the hardware is able to do it
- can be developed with only the features we need, without the overhead the other tools bring

### [](#self-built-boot-from-disk)self built boot from disk

- boot a persistent, preconfigured OS from a robust storage device and use data from a second "normal" storage device
- create image, flash disk, plug in and you are ready
- how to manage and update?
- where to put it?
- maybe crazy idea

### [](#commercial-software)commercial software

- there are some products out there
- mostly Foreman wearing a different hat, with differently coloured buttons and preconfigured for a specific ecosystem
- costs a lot of money and won't be easier than using open source tools

## [](#-configuration-deployment-)👨🏼‍🚀 Configuration Deployment 🚀

### [](#ansible)Ansible

> Ansible provides open-source automation that reduces complexity and runs everywhere. Using Ansible lets you automate virtually any task.  
> **Here are some common use cases for Ansible:**
> 
> - Eliminate repetition and simplify workflows
> - Manage and maintain system configuration
> - Continuously deploy complex software
> - Perform zero-downtime rolling updates
> 
> Ansible uses simple, human-readable scripts called playbooks to automate your tasks. You declare the desired state of a local or remote system in your playbook. Ansible ensures that the system remains in that state.  
>   
> **As automation technology, Ansible is designed around the following principles:**  
> **Agent-less architecture**  
> Low maintenance overhead by avoiding the installation of additional software across IT infrastructure.  
> **Simplicity**  
> Automation playbooks use straightforward YAML syntax for code that reads like documentation. Ansible is also decentralized, using SSH existing OS credentials to access to remote machines.  
> **Scalability and flexibility**  
> Easily and quickly scale the systems you automate through a modular design that supports a large range of operating systems, cloud platforms, and network devices.  
> **Idempotence and predictability**  
> When the system is in the state your playbook describes Ansible does not change anything, even if the playbook runs multiple times.  

[https://docs.ansible.com/](https://docs.ansible.com/)

- it's written in Python and can be installed as a Python package
- tons of modules for every use case
- very flexible approach
- amazing documentation & community support
- don't know a thing that can't be done with ansible on a running OS
- uses ssh only, no need for an agent
- works well with both bare metal and VMs
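
The idempotence principle from the quote can be illustrated without Ansible itself. The following is a minimal Python sketch, not Ansible code: `ensure_line` is a hypothetical helper, loosely inspired by what Ansible's `lineinfile` module does, that only touches a file when the desired line is missing and reports whether it changed anything. The file path and the host entry are just examples.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed and False if it was already
    in the desired state (hypothetical helper, not an Ansible API).
    """
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # desired state already reached: do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


# Run the same "task" twice against a throwaway file.
hosts_file = Path(tempfile.mkdtemp()) / "hosts"
first_run = ensure_line(hosts_file, "10.31.1.36 ub-edge-mgmt")
second_run = ensure_line(hosts_file, "10.31.1.36 ub-edge-mgmt")
print(first_run, second_run)  # True False
```

The second call reports no change, which is the behaviour a playbook run shows when the system already matches the described state.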

### [](#docker-compose)docker compose

> Docker Compose is a tool for defining and running multi-container applications. It is the key to unlocking a streamlined and efficient development and deployment experience.  
> Compose simplifies the control of your entire application stack, making it easy to manage services, networks, and volumes in a single, comprehensible YAML configuration file. Then, with a single command, you create and start all the services from your configuration file.  
> Compose works in all environments; production, staging, development, testing, as well as CI workflows. It also has commands for managing the whole lifecycle of your application: Start, stop, and rebuild services. View the status of running services. Stream the log output of running services. Run a one-off command on a service

- when everything is containerized, we might only need one docker-compose.yml, which can be launched after the bare metal installation
- easy to modify variables
- easy to tailor it to our needs
- very good tool for a modular approach
- included in the docker package
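
To make the "one file, easy to modify variables" idea a bit more concrete, here is a small Python sketch that builds a Compose-style definition from parameters. Everything in it is an assumption for illustration: the service selection, the host ports and the function name `build_compose` are made up, and the images are simply the public `prom/prometheus` and `grafana/grafana` images. The output is JSON only because Python's standard library has no YAML writer; YAML is designed as a superset of JSON, so Compose should be able to read it, but that is worth verifying before relying on it.

```python
import json


def build_compose(grafana_port: int = 3000, prometheus_port: int = 9090) -> dict:
    """Return a Compose-style definition for a hypothetical monitoring stack."""
    return {
        "services": {
            "prometheus": {
                "image": "prom/prometheus",
                # host port is configurable, container port is the image default
                "ports": [f"{prometheus_port}:9090"],
                "restart": "unless-stopped",
            },
            "grafana": {
                "image": "grafana/grafana",
                "ports": [f"{grafana_port}:3000"],
                "restart": "unless-stopped",
                "depends_on": ["prometheus"],
            },
        }
    }


# Change one variable and render the whole stack definition again.
compose_text = json.dumps(build_compose(grafana_port=8080), indent=2)
print(compose_text)
```

The printed text could be saved to a file and started with `docker compose -f <file> up -d`. In practice a hand-written `docker-compose.yml` with Compose's own variable substitution is probably the simpler route.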

### [](#terraform)Terraform

> HashiCorp Terraform is an infrastructure as code tool that lets you define both cloud and on-prem resources in human-readable configuration files that you can version, reuse, and share. You can then use a consistent workflow to provision and manage all of your infrastructure throughout its lifecycle. Terraform can manage low-level components like compute, storage, and networking resources, as well as high-level components like DNS entries and SaaS features.  
> Terraform creates and manages resources on cloud platforms and other services through their application programming interfaces (APIs). Providers enable Terraform to work with virtually any platform or service with an accessible API.

[https://www.terraform.io/](https://www.terraform.io/)

- we can use it for docker containers
- API approach
- config files look neat
- cloud/k8s first
- seen hints of it being able to also deploy to baremetal (proof needed)
- maybe very useful in later stages

## [](#-data-collection-and-monitoring-)🪣 Data Collection and Monitoring 🤯

### [](#prometheus)Prometheus

> Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company. To emphasize this, and to clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.  
> Prometheus collects and stores its metrics as time series data, i.e. metrics information is stored with the timestamp at which it was recorded, alongside optional key-value pairs called labels.  
> Source: [https://prometheus.io/docs/introduction/overview/](https://prometheus.io/docs/introduction/overview/)

[![Prometheus](https://prometheus.io/assets/architecture.png)](https://prometheus.io/assets/architecture.png)

[https://prometheus.io/](https://prometheus.io/)

- made for time series data
- stores data with timestamp
- more of a pull than push application
- visualize data with good query language
- prometheus + grafana = beautiful and meaningful dashboards
- alert functionality
- easy to get into
- needs endpoint to pull (sad but ok)
- scalable
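
Since Prometheus pulls, every data source needs an HTTP endpoint it can scrape. A rough sketch of such an endpoint, using only the Python standard library, could look like this. The metric names (`ub_edge_load1`, `ub_edge_cpu_count`) and the port are made up for illustration, and a real setup would more likely use the official `prometheus_client` library or an existing exporter.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(samples: dict[str, float]) -> str:
    """Render gauge samples in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(samples.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def collect() -> dict[str, float]:
    """Collect a few example values (metric names are assumptions)."""
    load1, _, _ = os.getloadavg()  # only available on Unix-like systems
    return {
        "ub_edge_load1": load1,
        "ub_edge_cpu_count": float(os.cpu_count() or 0),
    }


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve /metrics until interrupted (this call blocks)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; Prometheus would then be pointed at `http://<host>:8000/metrics` through a scrape config.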

### [](#fluentd)fluentd

> Fluentd is an open-source data collector for unified logging layer. It allows you to unify the data collection and consumption for better use and understanding of data. Fluentd is designed to simplify the collection and consumption of log data by providing a unified logging layer that's scalable, flexible, and can be implemented with minimal effort.  
> Fluentd has a flexible plugin system that allows the community to extend its functionality. Its plugins connect to many types of data sources and outputs, allowing you to tailor Fluentd to specific needs.

[![fluentd](https://www.fluentd.org/images/fluentd-architecture.png)](https://www.fluentd.org/images/fluentd-architecture.png)

[https://www.fluentd.org/](https://www.fluentd.org/)

- robust and scalable **data** unifier **|** collector **|** router **|** filter
- has only one job and is good at it
- very scalable
- does not store data
- can really collect anything and write almost everywhere
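
The "collector | router | filter" idea can be sketched in a few lines of Python. This is a conceptual toy, not the fluentd API: the `Router` class, the tags and the records are all invented for illustration. It borrows two ideas from fluentd, namely that every event carries a tag and that match rules are tried in order with the first hit winning. Patterns use Python's `fnmatch`, where `*` also matches dots, which is looser than fluentd's own tag matching.

```python
from fnmatch import fnmatchcase
from typing import Callable

Event = dict  # {"tag": str, "record": dict}


class Router:
    """Tiny tag-based router: the first matching rule wins."""

    def __init__(self) -> None:
        self.filters: list[Callable[[Event], bool]] = []
        self.rules: list[tuple[str, list[Event]]] = []

    def add_filter(self, keep: Callable[[Event], bool]) -> None:
        self.filters.append(keep)

    def add_match(self, pattern: str, sink: list[Event]) -> None:
        self.rules.append((pattern, sink))

    def emit(self, tag: str, record: dict) -> None:
        event = {"tag": tag, "record": record}
        if not all(keep(event) for keep in self.filters):
            return  # dropped by a filter
        for pattern, sink in self.rules:
            if fnmatchcase(tag, pattern):
                sink.append(event)  # a real output would write somewhere
                return


docker_logs: list[Event] = []
everything_else: list[Event] = []

router = Router()
router.add_filter(lambda e: e["record"].get("level") != "debug")
router.add_match("docker.*", docker_logs)
router.add_match("*", everything_else)

router.emit("docker.grafana", {"level": "info", "msg": "started"})
router.emit("docker.grafana", {"level": "debug", "msg": "noise"})
router.emit("sensor.temp", {"level": "info", "value": 21.5})

print(len(docker_logs), len(everything_else))  # 1 1
```

Note that the router itself stores nothing; the lists only stand in for outputs such as a database or a file.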

[![????](https://www.fluentd.org/images/blog/unified-logging-machine-to-machine.png)](https://www.fluentd.org/images/blog/unified-logging-machine-to-machine.png) tbh 2004 is closer today than it was 15 years ago

```
/(?:(?:^(?:\+|-){0,1}\d+\.$)|(?:^(?:\+|-){0,1}\.{0,1}\d+$)|(?:^(?:\+|-){0,1}\d+\.\d+$)|(?:^(?:\+|-){0,1}\.{0,1}\d+e(?:\+|-){0,1}\d+$)|(?:^(?:\+|-){0,1}\d+\.\d*e(?:\+|-){0,1}\d+$))/
```
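
The pattern above is given without comment. Read literally, it seems to accept signed decimal numbers and numbers in scientific notation with a lowercase `e`. A minimal Python sketch to check that reading follows; the helper name `looks_numeric` and the sample strings are my own assumptions, not part of the original.

```python
import re

# Same pattern as above, without the surrounding / / delimiters,
# split into one alternative per line for readability.
NUMBER = re.compile(
    r"(?:(?:^(?:\+|-){0,1}\d+\.$)"                       # 7.   +12.
    r"|(?:^(?:\+|-){0,1}\.{0,1}\d+$)"                    # 42   -.5
    r"|(?:^(?:\+|-){0,1}\d+\.\d+$)"                      # -3.14
    r"|(?:^(?:\+|-){0,1}\.{0,1}\d+e(?:\+|-){0,1}\d+$)"   # 2e10 .5e-3
    r"|(?:^(?:\+|-){0,1}\d+\.\d*e(?:\+|-){0,1}\d+$))"    # 1.5e-3
)


def looks_numeric(text: str) -> bool:
    """Return True if `text` matches the number pattern."""
    return NUMBER.search(text) is not None


for sample in ["42", "-3.14", "+7.", ".5", "1.5e-3", "2e10", "1E5", "1.2.3", "abc"]:
    print(f"{sample!r}: {looks_numeric(sample)}")
```

Only the last three samples are rejected: the pattern has no uppercase `E`, allows a single decimal point, and requires at least one digit.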

### [](#grafana)Grafana

> Grafana is an open-source platform for monitoring and observability. It allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. It's most commonly used for visualizing time series data for infrastructure and application analytics, but many use it in other domains including industrial sensors, home automation, weather, and process control.  
> Grafana provides charts, graphs, and alerts for the web when connected to supported data sources. It is expandable through a plug-in system and can be integrated with cloud-based data sources like AWS Cloudwatch, Google Cloud Monitoring, Azure Monitor, and many others.

[![grafana](https://grafana.com/media/products/cloud/grafana/grafana-dashboard-english.png?w=1040)](https://grafana.com/media/products/cloud/grafana/grafana-dashboard-english.png?w=1040)

[https://grafana.com/](https://grafana.com/)

- easy to use
- highly customizable
- create dashboards full of insights for every use case
- multiple datasources possible
- nice query language
- beautiful
- alerting function

### [](#influxdb-and-telegraf)InfluxDB and Telegraf

> InfluxDB is an open-source time series database developed by InfluxData. It is written in Go and optimized for fast, high-availability storage and retrieval of time series data in fields such as operations monitoring, application metrics, Internet of Things sensor data, and real-time analytics.  
> It also has support for processing data from Graphite. Telegraf is an open-source server agent to help you collect metrics from your stacks, sensors and systems. It's a piece of software that you can install anywhere in your infrastructure and it will read metrics from specified sources - it's plugin-driven for both the collection and the output of data. It supports a wide variety of data inputs and outputs. You can use it to collect metrics from third party APIs, or even listen for metrics via a StatsD and Kafka consumer services. It also has output plugins to send metrics to a variety of other datastores, services, and message queues, including InfluxDB, Graphite, OpenTSDB, Datadog, Librato, Kafka, MQTT, NSQ, and many others.

[https://www.influxdata.com/products/influxdb/](https://www.influxdata.com/products/influxdb/)

- influxdb = timeseries db
- usually paired with Telegraf, which is not as flexible as fluentd but still flexible enough
- might be better suited - matter of taste
- no PromQL but InfluxQL - matter of taste
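
Unlike Prometheus, InfluxDB is typically fed by pushing points in its line protocol (`measurement,tags fields timestamp`). Below is a small Python sketch of how such a line is put together. The helper names, the measurement and the tag values are assumptions for illustration, and in practice Telegraf or an official client library would do this formatting.

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape every character from `chars` that occurs in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def format_field(value) -> str:
    """Format one field value according to its Python type."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line(measurement: str, tags: dict, fields: dict, timestamp_ns=None) -> str:
    """Build one line: measurement,tag=value field=value [timestamp]."""
    head = _escape(measurement, ", ")
    for key, val in sorted(tags.items()):  # tags sorted by key
        head += f",{_escape(key, ',= ')}={_escape(val, ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={format_field(val)}" for key, val in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


print(
    to_line(
        "temperature",
        {"host": "ub-edge-mgmt", "room": "lab 1"},
        {"value": 21.5, "ok": True, "count": 3},
        1700000000000000000,
    )
)
# temperature,host=ub-edge-mgmt,room=lab\ 1 value=21.5,ok=true,count=3i 1700000000000000000
```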

### [](#monitoring-alternatives)monitoring alternatives

- nagios / checkmk
- zabbix
- PRTG
- self-built

# [](#nice-to-know-for-later)nice to know for later

## [](#lxd)LXD

> LXD is an open-source hypervisor for Linux container management developed by Canonical, the company behind Ubuntu. It provides a system for virtualization that is designed to manage and orchestrate LXC (Linux Containers).  
> LXD allows for operations such as live migration, snapshotting, network management, storage management, and more on containers. It presents a REST API for remote management and a command-line interface (CLI) for local management of containers.  
> LXD containers are lightweight, secure, and perform like virtual machines, but have the speed and efficiency of process containers. This makes LXD suitable for running full system containers, essentially emulating virtual machines without the associated overhead.

[https://canonical.com/lxd](https://canonical.com/lxd)

## [](#nixos)NixOS

Many people on the internet say that NixOS is a complete game changer. I have never met anyone in real life who has used NixOS, and most IT people haven't even heard of it, so I'm not sure what the truth is.

> In NixOS, the entire operating system — the kernel, applications, system packages, configuration files, and so on — is built by the Nix package manager from a description in a purely functional build language. The fact that it’s purely functional essentially means that building a new configuration cannot overwrite previous configurations. Most of the other features follow from this.

[https://nixos.org/](https://nixos.org/)

## [](#microcloud)MicroCloud

seems like an orchestrator for LXD, they talk a lot about edge computing

[![microcloud](https://res.cloudinary.com/canonical/image/fetch/f_auto,q_auto,fl_sanitize,w_1674,h_1032/https://assets.ubuntu.com/v1/e1c0038f-1.png)](https://res.cloudinary.com/canonical/image/fetch/f_auto,q_auto,fl_sanitize,w_1674,h_1032/https://assets.ubuntu.com/v1/e1c0038f-1.png)

[https://canonical.com/microcloud](https://canonical.com/microcloud)