Proceedings of the Fifteenth European Conference on Computer Systems, 2020
Data transfers impose a major bottleneck in heterogeneous system architectures. As a mitigation strategy, compute resources can be introduced in places where data occurs naturally. The increased diversity of compute resources in turn affects programming models and practicalities of software development for near-data compute kernels and raises the question of how those resources can be made accessible to users and applications. We introduce the Metal FS framework to improve the accessibility of FPGA-based near-storage accelerators: Firstly, we present a near-storage-compute-aware file system that enables self-contained, reusable compute kernels to operate on the granularity of file data streams. Secondly, we provide an integrated build process for FPGA overlay images that starts with the acquisition of compute kernels through a package manager and finally allows users to dynamically configure near-storage compute pipelines consisting of them. Thirdly, we integrate the framework into Linux as a file system driver and repurpose Unix pipes as a well-known operating system primitive to orchestrate near-storage compute pipelines. CCS Concepts: • Information systems → Storage architectures; • Software and its engineering → Development frameworks and environments.
GPU compute devices have become very popular for general-purpose computations. However, the SIMD-like hardware of graphics processors is currently not well suited for irregular workloads, like searching unbalanced trees. In order to mitigate this drawback, NVIDIA introduced an extension to GPU programming models called dynamic parallelism. This extension enables GPU programs to spawn new units of work directly on the GPU, allowing the refinement of subsequent work items based on intermediate results without any involvement of the main CPU. This work investigates methods for employing dynamic parallelism with the goal of improved workload distribution for tree search algorithms on modern GPU hardware. For the evaluation of the proposed approaches, a case study is conducted on the n-queens problem. Extensive benchmarks indicate that the benefits of improved resource utilization fail to outweigh high management overhead and runtime limitations due to the very fine level of granularity of the investigated problem. However, novel memory management concepts for passing parameters to child grids are presented. These general concepts are applicable to other, more coarse-grained problems that benefit from the use of dynamic parallelism.
The continuous testing of small changes to systems has proven to be useful and is widely adopted in the development of software systems. For this, software is tested in environments that are as close as possible to the production environments. When testing IoT systems, this approach is met with unique challenges that stem from the typically large scale of the deployments, heterogeneity of nodes, challenging network characteristics, and tight integration with the environment, among others. IoT test environments present a possible solution to these challenges by emulating the nodes, networks, and possibly domain environments in which IoT applications can be executed. This paper gives an overview of the state of the art in IoT testing. We derive desirable characteristics of IoT test environments, compare 18 tools that can be used in this respect, and give a research outlook on future trends in this area.
2017 Fifth International Symposium on Computing and Networking (CANDAR), 2017
With memory-centric architectures appearing on the horizon as potential candidates for future computer architectures, we propose that the tuple space paradigm is well suited for the task of managing the large shared memory pools that are a central concept of these new architectures. We support this hypothesis by presenting MemSpaces, an implementation of the tuple space paradigm based on POSIX shared memory objects. To demonstrate both efficacy and efficiency of the approach, we provide a performance evaluation that compares MemSpaces to message-based implementations of the tuple space paradigm. Due to the lack of commercial availability of adequate hardware, we perform the evaluation inside an emulated environment that mimics the general characteristics of memory-centric architectures. For many operations, MemSpaces is an order of magnitude faster than state-of-the-art implementations.
Contemporary distributed computing systems may provide high computing power combined with upcoming new networking technologies. However, until now, network-based parallel systems which employ interconnected computers (PCs, workstations, mainframes) as processing ...
The Object Management Group's (OMG) Common Object Request Broker Architecture (CORBA) is an important and popular technology that supports the development of object-based, distributed applications. The benefits of abstraction promised by CORBA (location transparency, heterogeneity, dynamic configuration, etc.) are appealing in many application domains, including those that satisfy real-time requirements, such as manufacturing, process control, and transport systems. Furthermore, those attributes make CORBA an interesting candidate for responsive (fault-tolerant, real-time) cluster computing. However, the specification of timing behavior and quality-of-service parameters like communication latency and acceptable processor utilization is beyond the scope of today's CORBA. Here, we present the "Composite Objects" approach for predictable integration of CORBA with real-time requirements. We discuss data replication and weak memory consistency as the key concepts for implementat...
This paper introduces an artificial neural network (ANN)-based framework for joint demosaicing of color filter array (CFA) raw image sequences. We propose an algorithm that offers superior resolution, signal-to-noise ratio, and dynamic range when compared to single-frame demosaicing. A rich set of both synthetic and real-world experimental results illustrates its capabilities.
Database and Expert Systems Applications, 2004. …, 2004
This paper reflects different understandings of and positions on future trends in GRID-oriented technologies, applications, and networks, as perceived by representatives from industry and academia. There is no definitive answer on the topic that is raised in the title. Instead, the ...
Abstract: The requirements for applications that support IT-based management of a mass casualty incident (MANV, from the German "Massenanfall von Verletzten") are manifold. Besides aspects of ergonomics (hardware, user interfaces), organizational questions must be taken into account. During a MANV, users are exposed to particularly stressful situations, and attitudes toward technology and the handling of new technologies can play a decisive role. In a MANV, it must also be assumed that communication infrastructure is not available at all or only ...
Almost a year ago, Microsoft introduced the .NET architecture as a new component-based programming environment, which allows for easy integration of classical distributed programming techniques with Web computing. .NET defines a type system and introduces notions such ...
Papers by A. Polze