... 4). Taking the one-slot system into account, a controller module must be reconfigured and executed within one sample period. Here, t_oneslot (eq. ... Computing the outer sum sequentially is called 1-BAAT (one bit at a time) and takes w clock cycles. ...
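The 1-BAAT idea named in this excerpt can be illustrated with a minimal bit-serial distributed-arithmetic sketch. It is not taken from the paper: the function names `build_lut` and `da_1baat`, the unsigned w-bit inputs, and the lookup-table formulation are assumptions chosen for illustration. The inner sum over coefficients is precomputed in a table, and the outer sum over bit positions is evaluated one bit per iteration, so the loop runs w times.

```python
def build_lut(coeffs):
    """Precompute the inner sums: entry `addr` holds the sum of all
    coefficients c_k whose bit k is set in `addr`."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_1baat(coeffs, inputs, w):
    """Compute sum(c_k * x_k) for unsigned w-bit inputs, one bit at a time.

    Returns the result and the number of loop iterations, which stands in
    for the clock-cycle count of a bit-serial hardware implementation.
    """
    lut = build_lut(coeffs)
    acc = 0
    cycles = 0
    for j in range(w):  # outer sum over bit positions, evaluated sequentially
        addr = 0
        for k, x in enumerate(inputs):
            addr |= ((x >> j) & 1) << k  # collect bit j of every input
        acc += lut[addr] << j  # weight the table entry by 2**j
        cycles += 1
    return acc, cycles


result, cycles = da_1baat([3, -2, 5], [7, 4, 9], w=4)
print(result, cycles)
```

Processing more than one bit per iteration would shorten the loop at the cost of more table lookups per step, which is the kind of area/time trade-off the excerpt alludes to.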
There has been significant interest in hardware-assisted deterministic Record and Replay (RnR) systems for multithreaded programs on multiprocessors. However, no proposal has implemented this technique in a hardware prototype with full operating system support. Such an implementation is needed to assess RnR practicality. This paper presents QuickRec, the first multicore Intel Architecture (IA) prototype of RnR for multithreaded programs. QuickRec is based on QuickIA, an Intel emulation platform for rapid prototyping of new IA extensions. QuickRec is composed of a Xeon server platform with FPGA-emulated second-generation Pentium cores, and Capo3, a full software stack for managing the recording hardware from within a modified Linux kernel. This paper's focus is understanding and evaluating the implementation issues of RnR on a real platform. Our effort leads to some lessons learned, as well as to some pointers for future research. We demonstrate that RnR can be implemented efficiently on a real multicore IA system. In particular, we show that the rate of memory log generation is insignificant, and that the recording hardware has negligible performance overhead. However, the software stack incurs an average recording overhead of nearly 13%, which must be reduced to enable always-on use of RnR.
Control systems can be implemented in reconfigurable hardware as an efficient and high-performance alternative to control algorithms executed by processors [4, 7, 10]. The large design space offered by reconfigurable hardware allows an exploration of different area/time trade-offs and to ...
Joint International Conference on Autonomic and Autonomous Systems and International Conference on Networking and Services - (icas-isns'05), 2005
This paper presents the architecture of an operating system (called NanoOS) for applications distributed over mobile ad hoc networks (MANETs). Furthermore, a service distribution method inspired by the foraging behavior of ants is proposed. The NanoOS offers a uniform ...
We introduce the concept of an operating system for platforms that consist, besides memory and peripheral devices, of FPGAs as the only computational resource. Applications can be developed independently of each other and, thanks to device drivers, with little dependency on the platform. The OS supports the multitasking execution of applications using static as well as dynamic resource assignment. A main focus of the paper is the management of memory as a resource. Memory management is introduced as part of the OS; it allows multiple tasks to access the same memory banks using virtual addressing and dynamic memory allocation. Access conflicts are resolved by priority-based scheduling. Since no microprocessor is part of the system, the entire OS, including its memory management, is executed on the FPGAs.
Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006
Reconfigurable hardware devices, such as FPGAs, are increasingly used in embedded systems. To utilize these devices for real-time workloads, scheduling techniques are required that generate predictable task timings. In this paper, we present a partitioning-EDF (earliest deadline first) approach to find such schedules. The FPGA area is partitioned along one dimension into slots. The tasks are partitioned into groups. Then, each group is scheduled to exactly one slot using the EDF rule. We show that the problem of finding an optimal partitioning is related to the well-known 2-dimensional level bin-packing problem. We extend a previously reported ILP model to solve our partitioning problem to optimality. By a simulation study we demonstrate that the partitioning-EDF approach is able to find feasible schedules for most task sets with a system utilization of up to 70%. Additionally, we allow a task to be realized in alternative implementations. A simulation study reveals that the scheduling performance increases considerably if three task variants instead of one are considered. Finally, we model and study the impact of the device reconfiguration time on the scheduling performance.
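The partitioning step described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it swaps the ILP model for a simple first-fit heuristic, assumes preemptive tasks with deadlines equal to periods, ignores reconfiguration time, and uses the classic single-resource EDF condition (total utilization of a group at most 1) as the per-slot feasibility check. The names `Task`, `Slot`, and `partition_first_fit` are hypothetical.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass(frozen=True)
class Task:
    name: str
    exec_time: int  # worst-case execution time
    period: int     # period; the deadline is assumed equal to the period
    width: int      # FPGA area needed along the partitioned dimension

    @property
    def utilization(self):
        return Fraction(self.exec_time, self.period)


@dataclass
class Slot:
    width: int
    tasks: list = field(default_factory=list)

    @property
    def utilization(self):
        return sum((t.utilization for t in self.tasks), Fraction(0))

    def accepts(self, task):
        # The task must fit the slot, and the group must stay EDF-feasible
        # under the assumed utilization bound of 1.
        return (task.width <= self.width
                and self.utilization + task.utilization <= 1)


def partition_first_fit(tasks, slot_widths):
    """Assign each task to exactly one slot; return None if any task fails."""
    slots = [Slot(w) for w in slot_widths]
    # Place the most demanding tasks first (first-fit decreasing heuristic).
    for task in sorted(tasks, key=lambda t: t.utilization, reverse=True):
        target = next((s for s in slots if s.accepts(task)), None)
        if target is None:
            return None
        target.tasks.append(task)
    return slots


tasks = [Task("A", 2, 4, 3), Task("B", 1, 4, 2),
         Task("C", 3, 4, 2), Task("D", 1, 2, 3)]
slots = partition_first_fit(tasks, [3, 3])
print([[t.name for t in s.tasks] for s in slots])
```

A heuristic like this can reject task sets that an optimal partitioning would accept, which is why an exact formulation such as the ILP model mentioned in the abstract is useful as a baseline.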
Proceedings. 2003 IEEE International Conference on Field-Programmable Technology (FPT) (IEEE Cat. No.03EX798)
We present a new approach for reconfigurable massively parallel computers. The approach uses FPGAs as reconfigurable devices to build parallel computers which can adapt their physical topology to match the virtual topology used to model the parallel computation paradigm of a given application. We use a case study in which a virtual ring topology is first simulated on a tree topology and then directly implemented in an FPGA configuration. Preliminary results show that the performance of parallel computers that use a message passing interface can be increased by up to 20% if a reconfigurable topology approach is used.
Proceedings of the 17th symposium on Integrated circuits and system design - SBCCI '04, 2004
The partial runtime reconfiguration capability of FPGAs allows task execution in a multitasking manner. In contrast to most other models, we assume that each task has several implementation variants with different performance and size. Moreover, one task variant is an extension of another. Therefore, a task can change between its variants without reconfiguring the entire task footprint. As a case study, we introduce an online scalable distributed arithmetic design and review the advantages.
In this paper, we consider the scheduling of periodic real-time tasks on reconfigurable hardware devices. Such devices can execute several tasks in parallel. All executing tasks share the hardware resource, which makes the scheduling problem differ from single- and multiprocessor scheduling. We adapt the global EDF multiprocessor scheduling approach to the reconfigurable hardware execution model and define two preemptive scheduling algorithms, EDF-First-k-Fit and EDF-Next-Fit. For these algorithms, we present a novel linear-time schedulability test and give a proof based on a resource augmentation technique. Then, we propose a task placement and relocation scheme utilizing partial device reconfiguration. This scheme allows us to extend the schedulability test to include reconfiguration time overheads. Experiments with synthetic workloads compare the scheduling test with the actual scheduling performance of EDF-First-k-Fit and EDF-Next-Fit. The main evaluation result is that the re...
2006 International Conference on Field Programmable Logic and Applications, 2006
This paper presents a prototype system that executes a set of periodic real-time tasks utilizing dynamic hardware reconfiguration. The proposed scheduling technique, MSDL, is not only able to give an offline guarantee for the feasibility of the task set but also minimizes the number of device configurations. After describing this technique, we extend the schedulability analysis to include different runtime system overheads, including the device reconfiguration time. Then we detail a lightweight runtime system that performs the online part of the MSDL scheduling technique. The runtime system is entirely implemented in hardware. Finally, we outline the corresponding synthesis tool flow and report on the overhead posed by the runtime system.
Papers by Klaus Danne