The lecture covers current modeling, development, and processing concepts for database-based application systems. This includes in particular the processing of structured, semi-structured, and unstructured data as well as techniques for integrating information from heterogeneous data sources. Among others, the following topics are covered in depth:
- XML and database technology (XML modeling, XML storage, XML query languages, XML processing; see the query sketch after this list)
- Content management (enterprise content management, information retrieval, search technologies)
- NoSQL data management (key-value stores, MapReduce, triple stores, document stores, graph stores)
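As a small, hedged illustration of querying semi-structured data (not taken from the lecture materials), the following Python snippet evaluates a simple path expression over an XML fragment using only the standard library; the catalogue data is made up for the example.

```python
# Illustrative sketch: querying semi-structured XML data with simple
# path expressions, using only Python's standard library.
import xml.etree.ElementTree as ET

catalog = ET.fromstring("""
<catalog>
  <book year="2020"><title>Data Management</title><price>49.90</price></book>
  <book year="2015"><title>XML Processing</title><price>29.90</price></book>
</catalog>
""")

# Select the titles of all books published after 2016.
recent = [b.findtext("title")
          for b in catalog.findall("book")
          if int(b.get("year")) > 2016]
print(recent)   # ['Data Management']
```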
• Wave and information propagation algorithms
• Termination detection
• Fault tolerance in distributed systems
• State machine replication
• Synchronization and deadlocks
Software-defined networking (SDN) is a popular technology for increasing the flexibility and efficiency of communication networks; it has attracted a lot of attention and support from academia and industry (e.g., Google, Microsoft, NEC, Cisco, VMware). SDN increases flexibility and efficiency by outsourcing network control (e.g., routing) from the network elements (switches) to a network controller running on standard servers. The network controller has a global view of the network (logical centralization) and facilitates the implementation of network control logic “in software” using standard languages like C, Python, Java, or even declarative approaches. With OpenFlow, a first standard protocol for SDN is available; it is implemented by hardware switches from several vendors as well as by software switches like Open vSwitch, which is typically used in data centers to connect virtual machines on hosts. Moreover, several open-source controller implementations exist.
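The following is a minimal sketch of what such "in software" control logic can look like. It assumes the open-source Ryu controller framework (Python, OpenFlow 1.3), which is not prescribed here but is one example of the freely available controller implementations mentioned above; the app simply floods every packet that a switch hands to the controller.

```python
# Minimal sketch of an SDN controller application, assuming the Ryu framework.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class HubController(app_manager.RyuApp):
    """Floods every incoming packet -- the simplest possible control logic."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg                      # the OpenFlow packet-in message
        datapath = msg.datapath           # the switch that sent it
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser

        # Tell the switch to flood the packet out of all ports.
        actions = [parser.OFPActionOutput(ofproto.OFPP_FLOOD)]
        out = parser.OFPPacketOut(
            datapath=datapath,
            buffer_id=msg.buffer_id,
            in_port=msg.match['in_port'],
            actions=actions,
            data=msg.data if msg.buffer_id == ofproto.OFP_NO_BUFFER else None)
        datapath.send_msg(out)
```

Under these assumptions, the app would be started with `ryu-manager` against an OpenFlow 1.3 switch such as Open vSwitch; the controller, not the switch, decides how each unmatched packet is forwarded.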
Many datasets and scientific problems are posed as high-dimensional problems, mapping from many inputs to target values. Learning from image data is a classic example: it may map hundreds of thousands of colour values (pixel data) to the class of an object or to physical quantities such as temperature or energy, and simulations often depend on many input parameters. If the problems were really as high-dimensional as they are formulated, conventional algorithms would be bound to fail. Fortunately, the “real” dimensionality is usually much lower. But we have to identify and handle this relevant low-dimensional structure to be able to approximate/learn the underlying problem and to quantify the relevant dependencies. In this course:
• We will learn the properties of high-dimensional problems, with surprising implications.
• We will examine algorithms to analyse high-dimensional data, identify important parameters or combinations thereof (dimensional analysis, dimensionality reduction).
• We will learn how to approximate high-dimensional problems so that we can analyse them and predict in a lower-dimensional representation.
• We will understand the limitations of representative algorithms suited to high-dimensional tasks, and we will implement and use them.
This course will bridge theory and practice by implementing and applying selected algorithms to real-world data in Python.
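As a taste of that practical part, here is a minimal, hedged sketch of dimensionality reduction with principal component analysis (PCA). It assumes scikit-learn and uses its small digits image dataset as a stand-in for the high-dimensional image data described above; the dataset and the 95% variance threshold are illustrative choices, not course material.

```python
# Illustrative sketch: reducing 64-dimensional image data with PCA,
# assuming scikit-learn is installed.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X, y = load_digits(return_X_y=True)   # X has shape (1797, 64): 64 pixel values per sample

pca = PCA(n_components=0.95)           # keep enough components for 95% of the variance
X_low = pca.fit_transform(X)

print(f"original dimensionality: {X.shape[1]}")
print(f"reduced dimensionality:  {X_low.shape[1]}")
print(f"explained variance ratio: {pca.explained_variance_ratio_.sum():.3f}")
```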
This lecture provides an introduction to the fundamentals of modeling and simulation, with the goal of preparing students for advanced lectures in this area. Since simulation methods can often be applied to many different classes of problems, the lecture is structured by method. Discrete models and their treatment form the main part of the lecture, but continuous models are also touched on as a complement. Whether discrete event simulation, game-theoretic approaches, cellular automata, predator-prey models, or fuzzy sets: the modeling approaches are as diverse as the problems to which they are applied. Traffic simulation, population growth, elections, and control are just some of the application areas from the natural sciences and engineering.
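As a small illustration of treating such a model in discrete time steps (a sketch, not taken from the lecture), the following Python snippet advances the classic Lotka-Volterra predator-prey equations with an explicit Euler step; all parameter values are made up for the example.

```python
# Illustrative sketch: discrete-time predator-prey simulation
# (Lotka-Volterra equations, explicit Euler stepping).
def simulate(prey=10.0, predators=5.0, alpha=1.1, beta=0.4,
             delta=0.1, gamma=0.4, dt=0.01, steps=5000):
    """Return the time series of prey and predator populations."""
    history = [(prey, predators)]
    for _ in range(steps):
        d_prey = alpha * prey - beta * prey * predators      # prey growth minus predation
        d_pred = delta * prey * predators - gamma * predators  # predator growth minus death
        prey += dt * d_prey
        predators += dt * d_pred
        history.append((prey, predators))
    return history

if __name__ == "__main__":
    series = simulate()
    print("final prey/predator populations: %.2f / %.2f" % series[-1])
```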
The lecture teaches how the essential artifacts of an IT system are modeled, the relationships and interplay between such artifacts, and the role of metamodels and how they are created. The lecture covers the following topics:
- Entity-relationship model and complex objects
- Relational model and relational algebra, overview of SQL
- Transformation from ER to relations, normalization (see the sketch after this list)
- XML, DTD, XML Schema, Info Set, namespaces
- Metamodels and repositories
- RDF, RDF-S and ontologies
- UML
- Petri nets, workflow nets
- BPMN
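The following sketch illustrates the ER-to-relational transformation named above (an illustrative example, not lecture material): two entity types and an m:n relationship become three relations. It uses Python's built-in sqlite3 module so it is self-contained; all table and attribute names are hypothetical.

```python
# Illustrative sketch: mapping a small ER model to relations in SQLite.
import sqlite3

ddl = """
CREATE TABLE Student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE Course (
    course_id  INTEGER PRIMARY KEY,
    title      TEXT NOT NULL
);
-- The m:n relationship becomes its own relation whose key combines
-- the keys of the participating entity types.
CREATE TABLE Enrolled_in (
    student_id INTEGER REFERENCES Student(student_id),
    course_id  INTEGER REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
"""

with sqlite3.connect(":memory:") as conn:
    conn.executescript(ddl)
    print([row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table'")])
```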
Driven by the requirements of innovative applications and services, new technologies for networked and distributed systems have emerged in recent years. In the Internet of Things, a large number of devices and everyday objects are equipped with sensors and actuators that are networked, mostly via wireless communication technologies (e.g. BLE, ZigBee, 6LoWPAN, LoRaWAN), and connected to the Internet. A corresponding trend can be found in the Industrial Internet of Things (IIoT, Industry 4.0), where machines, tools, logistics, etc. are connected. The flexibility and efficiency of distributed systems are being advanced through virtualisation (e.g. NFV) and software-defined systems (e.g. SDN), which enable flexible adaptation and dynamic scaling. Driven by the hype around Bitcoin, distributed ledger technologies (e.g. blockchain) and advanced concepts such as smart contracts have been developed. They not only serve as the basis for electronic currencies (e.g. Bitcoin, Ethereum), but also support other applications in which a consensus between parties must be reached and documented. Another challenge is to reduce latencies, for example by using nearby edge and fog resources in addition to remote cloud resources, or by implementing optimised communication protocols that, among other things, enable a fast connection between client and server. Mobile (5G) communication technologies and systems have developed rapidly. For example, Covid-19 tracing applications use mobile end-user devices for contact tracing. This method, known as crowdsensing, can generally be used to collect large amounts of geographically distributed sensor data. In addition to these aspects, this seminar will discuss a wide range of current Internet technologies, protocols and standards that enable networked and distributed applications and services. Other possible topics include machine-to-machine communication (M2M), OPC-UA (Unified Architecture), real-time communication (Real-Time Ethernet), WWW technologies and protocols (HTTP 2.0/SPDY), and new transport protocols (QUIC, Multipath TCP).
Practical Course Information Systems: Data-intensive Computing
Semester:
SS 25
Content:
This practical course mainly focuses on data-intensive applications and how they process, analyze, and store their data. Nowadays, the amount of data in applications is growing rapidly. This data can be either structured, e.g., stored in relational database systems such as MySQL, or unstructured, e.g., stored in text documents, pictures, or other media files. Deriving information and, as a consequence, knowledge from this data is a huge challenge. To cope with this issue, new data analytics and data storage technologies have been developed that focus on the one hand on deriving information from this large amount of structured and unstructured data (e.g., data mining, natural language processing, ...) and on the other hand on efficient data storage (e.g., NoSQL databases, column stores, ...). Modern web technologies are used to provide the web frontend and to visualize huge data sets and analytic results. Furthermore, to enable fast, efficient application development as well as a high degree of accessibility, flexibility, and scalability, cloud computing platforms are often used to implement such applications.
In this practical course, you will work in small teams. Each team will implement a new or extend an existing data-intensive application, based on tasks that typically involve data analysis, data visualization, data storage, and efficient data processing. The infrastructure used, e.g. a cloud-based infrastructure such as IBM Cloud, depends on the application.
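As one hedged illustration of the kind of processing involved (the actual tools and infrastructure depend on the assigned project), the following framework-free Python sketch shows the map/shuffle/reduce pattern behind counting words in unstructured text, the idea that systems like MapReduce scale out over many machines.

```python
# Illustrative sketch: the map/reduce pattern for word counting,
# without any distributed framework.
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

def map_phase(documents: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in the unstructured input."""
    for doc in documents:
        for word in doc.lower().split():
            yield word, 1

def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Group the pairs by key and sum the counts per word."""
    counts: Dict[str, int] = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["structured data in relational databases",
        "unstructured data in text documents"]
print(reduce_phase(map_phase(docs)))   # e.g. {'data': 2, 'in': 2, ...}
```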
The Internet of Everything (IoE), where virtually everything can now communicate through the Internet, and the increasingly demanding performance requirements of new technologies (e.g., cryptocurrencies) have driven the emergence of new computing paradigms for distributed systems. Scalability is now offered not only by centralized cloud providers, but also by edge computing systems, where geographically distributed servers provide computational resources at the edge of the network and, therefore, close to the end devices. This can significantly reduce latency for time-critical applications like vehicular networks. The advances in edge computing have led to the emergence of edge AI, where powerful AI algorithms are deployed at the edge without relying on a remote cloud. But distributed systems come with many challenges, which require a profound understanding of core principles in distributed computing. As pointed out by former Google Senior Vice President Urs Hölzle: “At scale, everything breaks ... Keeping things simple and yet scalable is actually the biggest challenge. It's really, really hard.” This is especially true for the dynamic and uncertain environments that we face, for instance, in smart buildings or smart energy systems. Self-adaptation is one of the key mechanisms for coping with increasingly large and dynamic systems, often by using machine learning techniques (GNNs, reinforcement learning). Challenges that come with distributed storage systems include consistency and scalability. Another hot topic, especially in the context of 5G and the development of future 6G networks, is Time-Sensitive Networking (TSN), which defines a set of standards to enable reliable, deterministic real-time communication in Ethernet networks. These standards target, among others, time synchronization and traffic shaping/scheduling approaches for both event-based and time-triggered traffic. In this seminar, we take a deep dive into specific concepts of distributed and context-aware systems that tackle the above challenges. The topics will be published on the department’s website and are assigned according to a standardized procedure as explained during the kick-off.