Among the topics to be discussed in this course are:
- XML and database technology (XML modeling, XML storage, XML query languages, XML processing)
- NoSQL data management (key-value stores, MapReduce, triple stores, document stores, graph stores)
- Content management (enterprise content management, information retrieval, search technologies)
The module is split into two lectures with the following specific content.
Distributed Systems Concepts and Architectures (winter term)
• Architectures
  o Client/server systems, naming, trading
  o Structured and unstructured peer-to-peer systems
  o Multi-tier systems
  o Edge cloud, mobile and pervasive computing systems
• System software and paradigms
  o Interaction and data representation
  o Remote Procedure Calls (RPC) and Remote Method Invocation (RMI)
  o Distributed shared memory
  o Event-based and publish/subscribe communication
Distributed Systems Algorithms (summer term)
• Wave and information propagation algorithms
• Termination detection
• Fault tolerance in distributed systems
• State machine replication
• Synchronization and deadlocks
Software-defined networking (SDN) is a popular technology for increasing the flexibility and efficiency of communication networks that has attracted considerable attention and support from academia and industry (e.g., Google, Microsoft, NEC, Cisco, VMware). SDN increases flexibility and efficiency by moving network control (e.g., routing) out of the network elements (switches) into a network controller running on standard servers. The network controller has a global view of the network (logical centralization) and facilitates the implementation of network control logic “in software” using standard languages like C, Python, or Java, or even declarative approaches. With OpenFlow, a first standard protocol for SDN is available that is implemented by hardware switches from several vendors as well as by software switches like Open vSwitch, which is typically used in data centers to connect virtual machines on hosts. Moreover, several open-source controller implementations exist.
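The match-action idea behind SDN can be sketched in a few lines of Python. This is a toy model with made-up rule and packet structures (not the actual OpenFlow protocol): the controller installs rules into a switch's flow table, the switch applies the first matching rule to each packet, and packets that match no rule are handed to the controller (a table miss).

```python
# Toy sketch of the SDN match-action idea: a logically centralized
# controller installs flow rules; the switch data path matches packets
# against them. Simplified illustration, not the real OpenFlow protocol.

class FlowTable:
    def __init__(self):
        self.rules = []  # list of (match_dict, action), in priority order

    def install(self, match, action):
        """Called by the (logically centralized) network controller."""
        self.rules.append((match, action))

    def process(self, packet):
        """Switch data path: the first matching rule wins."""
        for match, action in self.rules:
            if all(packet.get(k) == v for k, v in match.items()):
                return action
        return "send_to_controller"  # table miss: ask the controller

table = FlowTable()
table.install({"dst": "10.0.0.2"}, "forward:port2")
table.install({"dst": "10.0.0.3"}, "forward:port3")

print(table.process({"src": "10.0.0.1", "dst": "10.0.0.2"}))  # forward:port2
print(table.process({"src": "10.0.0.1", "dst": "10.0.0.9"}))  # send_to_controller
```

Real OpenFlow rules match on many header fields (ports, MAC/IP addresses, VLAN tags) and carry priorities, counters, and timeouts; the two-field dictionary here only illustrates the control/data-path split.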
Many datasets and scientific problems are posed as high-dimensional problems, mapping from many inputs to target values. Learning from image data is a classic example that can map hundreds of thousands of colour values (pixel data) to the class of an object or to physical values such as temperature or energy, and simulations often depend on many input parameters. If these problems were really as high-dimensional as their formulation suggests, conventional algorithms would be bound to fail. Fortunately, the “real” dimensionality is usually much lower. But we have to identify and handle this relevant low-dimensional structure to be able to approximate/learn the underlying problem and to quantify the relevant dependencies. In this course:
• We will learn the properties of high-dimensional problems, with surprising implications.
• We will examine algorithms to analyse high-dimensional data, identify important parameters or combinations thereof (dimensional analysis, dimensionality reduction).
• We will learn how to approximate high-dimensional problems so that we can analyse them and predict in a lower-dimensional representation.
• We will understand the limitations of representative algorithms suited to high-dimensional tasks, and we will implement and use them.
This course will bridge theory to practice, implementing and using selected algorithms for real-world data using Python.
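As a small taste of the algorithms covered, the sketch below recovers low-dimensional structure with principal component analysis (PCA), using only NumPy. The synthetic data is made up for illustration: 200 points in 10 dimensions that really live on a 2-dimensional subspace plus small noise, i.e. exactly the "low real dimensionality" situation described above.

```python
import numpy as np

def pca(X, k):
    """Project an n x d data matrix X onto its k leading principal components."""
    Xc = X - X.mean(axis=0)                  # center the data
    cov = Xc.T @ Xc / (len(X) - 1)           # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    components = eigvecs[:, ::-1][:, :k]     # keep the k largest
    return Xc @ components

# Synthetic data with hidden 2-D structure embedded in 10 dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))           # the true low-dimensional factors
embed = rng.normal(size=(2, 10))             # linear embedding into 10-D
X = latent @ embed + 0.01 * rng.normal(size=(200, 10))

Z = pca(X, 2)
print(Z.shape)  # (200, 2)
```

Two components suffice here because the data was constructed that way; for real data, inspecting the eigenvalue spectrum reveals how many directions actually carry variance.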
Fundamentals of wireless data transmission
Media access for wireless networks
Location management
Wireless wide-area networks and mobile communication systems (GSM, GPRS, UMTS)
Wireless local-area and personal-area networks: IEEE 802.11, Bluetooth
Ad-hoc networks: routing protocols and algorithms
Mobility in IP networks: Mobile IP
Transport layer protocols for mobile systems
Mobile data management concepts
Android programming
Driven by the requirements of innovative applications and services, new technologies for networked and distributed systems have emerged in recent years. In the Internet of Things, a large number of devices and everyday objects are equipped with sensors and actuators that are networked via mostly wireless communication technologies (e.g. BLE, ZigBee, 6LoWPAN, LoRaWAN) and connected to the Internet. A corresponding trend can be found in the Industrial Internet of Things (IIoT, Industry 4.0), where machines, tools, logistics, etc. are connected. Flexibility and efficiency of distributed systems are being advanced through virtualisation (e.g. NFV) and software-defined systems (e.g. SDN), which enable flexible adaptation and dynamic scaling. Driven by the hype around Bitcoin, distributed ledger technologies (e.g. blockchain) and advanced concepts such as smart contracts have been developed. They not only serve as the basis for electronic currencies (e.g. Bitcoin, Ethereum), but also support other applications in which a consensus between parties must be reached and documented. Another challenge is to reduce latencies, for example by using nearby edge and fog resources in addition to remote cloud resources or by implementing optimised communication protocols that, among other things, enable a fast connection between client and server. Mobile (5G) communication technologies and systems have developed rapidly. For example, Covid-19 tracing applications use mobile end-user devices for contact tracing. This method, known as crowdsensing, can generally be used to collect large amounts of geographically distributed sensor data. In addition to these aspects, this seminar will discuss a wide range of current Internet technologies, protocols and standards that enable networked and distributed applications and services.
Other possible topics include machine-to-machine communication (M2M), OPC-UA (Unified Architecture), real-time communication (Real-Time Ethernet), WWW technologies and protocols (HTTP 2.0/SPDY), and new transport protocols (QUIC, Multipath-TCP).
Practical Course Information Systems: Data-intensive Computing
Semester:
SS 25
Content:
This practical course focuses on data-intensive applications and how they process, analyze and store their data. Nowadays, the amount of data in applications is growing rapidly. This data can be either structured, e.g., stored in relational database systems such as MySQL, or unstructured, e.g., stored in text documents, pictures or other media files. Deriving information and, as a consequence, knowledge from this data is a huge challenge. To cope with this issue, new data analytics and data storage technologies have been developed that focus on the one hand on deriving information from this large amount of structured and unstructured data (e.g., data mining, natural language processing, ...) and on the other hand on efficient data storage (e.g., NoSQL databases, column stores, ...). Modern web technologies are used to provide the web frontend and to visualize huge data sets and analytic results. Furthermore, to enable fast, efficient application development as well as a high degree of accessibility, flexibility and scalability, cloud computing platforms are often used to implement such applications.
In this practical course, you will work in small teams. Each team will implement a new or extend an existing data-intensive application based on tasks that typically involve data analysis, data visualization, data storage and efficient data processing. The infrastructure used, e.g., a cloud platform like IBM Cloud, depends on the application.
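To illustrate one of the storage ideas mentioned above, the sketch below contrasts a row-wise and a column-wise layout in plain Python (the table and query are made up for illustration). A column store keeps each attribute contiguously, so an aggregate over one attribute only touches that attribute's values instead of scanning whole records.

```python
# Row store: each record is stored together.
rows = [
    {"id": 1, "city": "Stuttgart", "temp": 21.5},
    {"id": 2, "city": "Berlin",    "temp": 18.0},
    {"id": 3, "city": "Stuttgart", "temp": 23.0},
]

# Column store: one array per attribute; values at index i form record i.
columns = {
    "id":   [1, 2, 3],
    "city": ["Stuttgart", "Berlin", "Stuttgart"],
    "temp": [21.5, 18.0, 23.0],
}

# Aggregate query: average temperature.
avg_row = sum(r["temp"] for r in rows) / len(rows)     # scans whole records
avg_col = sum(columns["temp"]) / len(columns["temp"])  # scans one column only

assert avg_row == avg_col
print(round(avg_col, 2))  # 20.83
```

Real column stores (and analytic databases built on them) add compression and vectorized execution on top of this layout, which is why they excel at analytical scans while row stores remain better for fetching whole records.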
The programming project offers an introduction to the world of scientific computing and simulation. The collision of two galaxies will be implemented using an N-body simulation.
The project is divided into three phases: 1. Getting to know the problem and a first implementation of the N-body problem using the naive brute-force algorithm. 2. Extending the implementation with the more efficient, tree-based Barnes-Hut algorithm. 3. Performance evaluation and a competition.
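The naive brute-force force computation of phase 1 can be sketched as follows, here as a minimal 2-D Python version with made-up simulation units (the project itself may prescribe a different language, dimensionality and integration scheme):

```python
import math

G = 1.0  # gravitational constant in simulation units (assumption)

def accelerations(positions, masses, eps=1e-3):
    """Naive O(n^2) pairwise gravitational accelerations (brute force).

    eps is a softening term that avoids division by zero when two
    bodies come very close to each other.
    """
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        xi, yi = positions[i]
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            r2 = dx * dx + dy * dy + eps * eps
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            acc[i][0] += G * masses[j] * dx * inv_r3
            acc[i][1] += G * masses[j] * dy * inv_r3
    return acc

# Two equal bodies accelerate toward each other symmetrically.
pos = [(0.0, 0.0), (1.0, 0.0)]
acc = accelerations(pos, [1.0, 1.0])
print(acc[0][0] > 0 and acc[1][0] < 0)  # True
```

Phase 2's Barnes-Hut algorithm replaces the inner loop with a quadtree (octree in 3-D) traversal that approximates sufficiently distant groups of bodies by their center of mass, cutting the cost from O(n²) to O(n log n).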
This seminar deals with (mathematical) modeling, i.e., the question of how we can describe real-world phenomena using the methods of mathematics and thereby make them accessible to computer simulation. Simulation is needed wherever experiments are too expensive or not justifiable, and that is almost everywhere: crash tests, blood flow in the heart, road traffic in Stuttgart, or the spread of pandemics... The seminar offers topics that complement the statistics/stochastics and numerics lectures, as well as topics that already point towards the simulation of complicated problems on the computer (which may well be a supercomputer). The topics are varied and range from abstract high-dimensional balls to the concrete modeling of pandemics.
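As an example of the kind of model discussed, here is a minimal SIR pandemic model advanced with explicit Euler time steps (the parameter values are made up for illustration): susceptible individuals become infected through contacts with infected ones, and infected individuals recover at a constant rate.

```python
def sir_step(s, i, r, beta, gamma, dt):
    """One explicit-Euler step of the SIR model; s + i + r is conserved."""
    new_inf = beta * s * i * dt   # infections from S-I contacts
    new_rec = gamma * i * dt      # recoveries
    return s - new_inf, i + new_inf - new_rec, r + new_rec

s, i, r = 0.99, 0.01, 0.0         # fractions of the population
for _ in range(1000):             # simulate 100 time units with dt = 0.1
    s, i, r = sir_step(s, i, r, beta=0.3, gamma=0.1, dt=0.1)

print(round(s + i + r, 6))  # 1.0 -- the total population is conserved
```

With beta/gamma = 3 here, an epidemic takes off and a large fraction of the population ends up recovered; in the seminar, such models are refined (age groups, spatial structure) and solved with proper numerical integrators instead of this simple Euler scheme.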
The Internet of Everything (IoE), where virtually everything can now communicate through the Internet, and the increasingly demanding performance requirements of new technologies (e.g., cryptocurrencies) have driven the emergence of new computing paradigms for distributed systems. Scalability is now offered not only by centralized cloud providers, but also by edge computing systems, where geographically distributed servers provide computational resources at the edge of the network and, therefore, close to the end devices. This can significantly reduce latency for time-critical applications like vehicular networks. The advances in edge computing have led to the emergence of edge AI, where powerful AI algorithms are deployed at the edge, without relying on a remote cloud. But distributed systems come with many challenges which require a profound understanding of core principles in distributed computing. As pointed out by former Google Senior Vice President Urs Hölzle: “At scale, everything breaks ... Keeping things simple and yet scalable is actually the biggest challenge. It's really, really hard.” This is especially true for the dynamic and uncertain environments that we are facing, for instance, in smart buildings or smart energy systems. Self-adaptation is one of the key mechanisms for coping with increasingly large and dynamic systems, often by using machine learning techniques (GNNs, reinforcement learning). Challenges that come with distributed storage systems include consistency and scalability. Another hot topic, especially in the context of 5G and the development of future 6G networks, is Time-Sensitive Networking (TSN), which defines a set of standards to enable reliable, deterministic real-time communication in Ethernet networks. These standards target, among others, time synchronization and traffic shaping/scheduling approaches for both event-based and time-triggered traffic.
In this seminar, we take a deep dive into specific concepts of distributed and context-aware systems that tackle the above challenges. The topics will be published on the department’s website and are assigned according to a standardized procedure as explained during the kick-off.