(Lecture Notes in Computer Science 10793) Mladen Berekovic, Rainer Buchty, Heiko Hamann, Dirk Koch, Thilo Pionteck - Architecture of Computing Systems – ARCS

Document Outline

  • Preface
  • Organization
  • Biologically-Inspired Massively-Parallel Computation (Keynote Talk)
  • Contents
  • Embedded Systems
  • Trade-Off Between Performance, Fault Tolerance and Energy Consumption in Duplication-Based Taskgraph Scheduling
    • 1 Introduction
    • 2 The Trade-Off Problem
    • 3 Fault Tolerant and Energy Efficient Scheduling
      • 3.1 Previous Approach
      • 3.2 Extensions
    • 4 Runtime System
      • 4.1 System Check Tool
      • 4.2 Scheduler and User Preferences
      • 4.3 Runtime System
    • 5 Power Model
      • 5.1 Model Validation
      • 5.2 Real-World Evaluation
    • 6 Experimental Results
    • 7 Conclusions
    • References
  • Lipsi: Probably the Smallest Processor in the World
    • 1 Introduction
    • 2 Related Work
    • 3 The Lipsi Design
      • 3.1 The Datapath
      • 3.2 The Instruction Set
      • 3.3 Implementation and Assembly in Hardware
      • 3.4 Simulation and Testing
      • 3.5 Developing a Processor
    • 4 Evaluation and Discussion
      • 4.1 Resource Consumption
      • 4.2 The Smallest Processor?
      • 4.3 A Lipsi Manycore Processor
      • 4.4 Lipsi in Teaching
      • 4.5 Source Access
    • 5 Conclusion
    • References
  • Superlinear Scalability in Parallel Computing and Multi-robot Systems: Shared Resources, Collaboration, and Network Topology
    • 1 Introduction
      • 1.1 Superlinear Performance in Multi-robot Systems
      • 1.2 Universal Scalability Law
    • 2 Unified Interpretation Across Fields of Research
    • 3 Results
      • 3.1 Stick Pulling: Shared Resources and Collaboration
      • 3.2 Parallel Optimization: Network Topologies and Information Flow
    • 4 Discussion and Conclusion
    • References
  • Multicore Systems
  • Closed Loop Controller for Multicore Real-Time Systems
    • 1 Introduction
    • 2 Related Work
    • 3 Closed Performance Control Loop
      • 3.1 Basic Fingerprinting
      • 3.2 Pulse Width Modulated Interferences
      • 3.3 Closed Loop Controller
    • 4 Evaluation
      • 4.1 PWM Effectiveness
      • 4.2 Closed Loop Controller
    • 5 Conclusion
    • References
  • Optimization of the GNU OpenMP Synchronization Barrier in MPSoC
    • 1 Introduction
    • 2 Related Work
    • 3 The GNU OpenMP Synchronization Barrier Mechanism
      • 3.1 Code Parallelization and Synchronization
      • 3.2 Active Wait and GNU OpenMP Policy
    • 4 Experimentation Environment
    • 5 Active Wait Optimization for GNU OpenMP Synchronization Barrier
      • 5.1 Barrier Mechanism Measurements and Study
      • 5.2 Optimization Proposal
      • 5.3 Micro-benchmark Results
      • 5.4 Performances Evaluation on the NAS Benchmark IS Application
    • 6 Conclusion
    • References
  • Analysis and Optimization
  • Ampehre: An Open Source Measurement Framework for Heterogeneous Compute Nodes
    • 1 Introduction
    • 2 Architecture and Components of Ampehre
      • 2.1 Extended PAPI Library
      • 2.2 Ampehre Library API
      • 2.3 Ampehre Tools
    • 3 Example: Measuring Energy on CPU and GPU
    • 4 Balancing Accuracy and Overhead
    • 5 Availability and Extensibility of Ampehre
    • 6 Conclusion
    • References
  • A Hybrid Approach for Runtime Analysis Using a Cycle and Instruction Accurate Model
    • 1 Introduction
    • 2 Related Work
    • 3 Proposed Methodology
      • 3.1 Analyzing the Program
      • 3.2 Running the Simulation
    • 4 Evaluation
      • 4.1 Metric
      • 4.2 Results
    • 5 Conclusion
    • References
  • On-chip and Off-chip Networks
  • A CAM-Free Exascalable HPC Router for Low-Energy Communications
    • 1 Introduction
    • 2 Related Work
    • 3 ExaNeSt System Architecture
      • 3.1 Router Architecture
      • 3.2 Routing Algorithms
    • 4 Evaluation
      • 4.1 Experimental Setup
      • 4.2 Area
      • 4.3 Power Consumption
      • 4.4 Performance
    • 5 Conclusions and Future Work
    • References
  • Lightweight Hardware Synchronization for Avoiding Buffer Overflows in Network-on-Chips
    • 1 Introduction
    • 2 Related Work and Background
    • 3 Synchronization Concept
    • 4 Hardware Supported ready Synchronization
      • 4.1 Hardware Implementation
      • 4.2 New Instructions
      • 4.3 Impact of Ready Synchronization on Hardware Size
    • 5 Evaluation
      • 5.1 Comparison of Ready Synchronization in Software and Hardware
      • 5.2 Execution Times
      • 5.3 Impact on Hardware Costs
    • 6 Conclusion
    • References
  • Network Optimization for Safety-Critical Systems Using Software-Defined Networks
    • 1 Introduction
    • 2 Related Work
    • 3 Problem Formulation
    • 4 Experimental Setup
      • 4.1 Assumptions
      • 4.2 Baseline
    • 5 Numerical Results and Discussion
      • 5.1 Standard Networks
      • 5.2 Critical Networks
    • 6 Conclusion and Future Work
    • References
  • CaCAO: Complex and Compositional Atomic Operations for NoC-Based Manycore Platforms
    • 1 Introduction
    • 2 Related Work
    • 3 Complex and Compositional Atomic Operations
      • 3.1 Comparison of the Synchronization Primitives () and ()
      • 3.2 CaCAO Approach ()
    • 4 Implementation Aspects
    • 5 Experimental Setup and Results
    • 6 Conclusion and Future Work
    • References
  • Memory Models and Systems
  • Redundant Execution on Heterogeneous Multi-cores Utilizing Transactional Memory
    • 1 Introduction
    • 2 Related Work
    • 3 Transaction-Based Redundant Execution Model
      • 3.1 Loosely-Coupled Redundancy with Checkpoints
      • 3.2 Extension of HTM to Support Fault Tolerance
      • 3.3 Heterogeneous Redundant Systems
    • 4 Evaluation
    • 5 Conclusion
    • References
  • Improving the Performance of STT-MRAM LLC Through Enhanced Cache Replacement Policy
    • 1 Introduction
    • 2 Related Work
    • 3 Motivation and Approach
      • 3.1 Motivational Example
      • 3.2 Writes Operations at Last-Level Cache
      • 3.3 Cache Replacement Policy
    • 4 Experimental Results
      • 4.1 Environment Setup
      • 4.2 Results
    • 5 Conclusion and Perspectives
    • References
  • On Automated Feedback-Driven Data Placement in Multi-tiered Memory
    • 1 Introduction
    • 2 Related Work
    • 3 Feedback-Driven Data Placement for Hybrid Memories
      • 3.1 Allocation Site Partitioning
      • 3.2 Profile-Guided Management
    • 4 Implementation Details
      • 4.1 Associating Memory Usage Profiles with Program Allocation Sites
      • 4.2 Hybrid Memory Management
    • 5 Experimental Framework
      • 5.1 Simulation Platform
      • 5.2 Benchmarks Description
    • 6 Evaluation
      • 6.1 Baseline Configurations
      • 6.2 Static Application Guidance
      • 6.3 Adaptive Application Guidance
      • 6.4 Comparison with OS/Architectural Reactive Profiling
      • 6.5 Performance Summary
    • 7 Conclusions and Future Work
    • References
  • Operational Characterization of Weak Memory Consistency Models
    • 1 Introduction
    • 2 Related Work
    • 3 View-Based Definitions of Memory Consistency Models
      • 3.1 Local Consistency
      • 3.2 Cache Consistency (CC)
      • 3.3 Pipelined-RAM (PRAM) Consistency
      • 3.4 Sequential Consistency (SC)
    • 4 Operational Definitions of Memory Consistency Models
      • 4.1 Basic Components
      • 4.2 Reference Machine for Local Consistency
      • 4.3 Reference Machine for Cache Consistency
      • 4.4 Reference Machine for PRAM Consistency
      • 4.5 Reference Machine for Sequential Consistency
      • 4.6 Implementation of Reference Machines
    • 5 Conclusions and Future Work
    • References
  • Energy Efficient Systems
  • A Tightly Coupled Heterogeneous Core with Highly Efficient Low-Power Mode
    • 1 Introduction
    • 2 Existing TCHC Architecture
      • 2.1 Composite Core
      • 2.2 Front-End Execution Architecture
    • 3 Dual-Mode Front-End Execution Architecture
      • 3.1 Implementation of LP Mode
      • 3.2 Switching from HP to LP Mode
      • 3.3 Switching from LP to HP Mode
      • 3.4 Execution Correctness
      • 3.5 LP Mode Utilization
      • 3.6 Hardware Cost
    • 4 Evaluation
      • 4.1 Evaluation Environment
      • 4.2 Evaluation Results
    • 5 Related Work
    • 6 Conclusion
    • References
  • Performance-Energy Trade-off in CMPs with Per-Core DVFS
    • 1 Introduction
    • 2 Related Work
    • 3 Model Construction Methodology
      • 3.1 Contention Metrics
      • 3.2 Data Collection
      • 3.3 Building the Model
      • 3.4 Application of the Model
    • 4 Comparison of Machine Learning Algorithms
    • 5 Evaluation
      • 5.1 Evaluation Setup
      • 5.2 Analysis of the Results
    • 6 Conclusion
    • References
  • Towards Fine-Grained DVFS in Embedded Multi-core CPUs
    • 1 Introduction
    • 2 Related Works
    • 3 Fine-Grained DVFS
      • 3.1 DVFS Points Extension
      • 3.2 Overhead Characterization
    • 4 Experimental Results
      • 4.1 DVFS Points Extension
      • 4.2 Overhead Characterization
    • 5 Conclusions
    • References
  • Partial Reconfiguration
  • Evaluating Auto-adaptation Methods for Fine-Grained Adaptable Processors
    • 1 Introduction
    • 2 Approach
      • 2.1 Target Processor
      • 2.2 Proposed Auto-adapting Method
    • 3 Implementation
      • 3.1 Common
      • 3.2 Window-Based Monitoring
      • 3.3 BTCB
      • 3.4 Phase Change Annotations
    • 4 Evaluation
      • 4.1 Experimental Setup
      • 4.2 Results
    • 5 Related Work
    • 6 Conclusions
    • References
  • HLS Enabled Partially Reconfigurable Module Implementation
    • 1 Introduction
    • 2 Related Work
    • 3 Model
    • 4 Bounding Box Generation
      • 4.1 Overview
      • 4.2 Generation
    • 5 Case Study
      • 5.1 Maxeler System and Dataflow
      • 5.2 Static System
      • 5.3 Implemented Modules
      • 5.4 Mitigation Strategies
    • 6 Conclusion
    • References
  • Hardware Acceleration in Genode OS Using Dynamic Partial Reconfiguration
    • 1 Introduction
    • 2 Genode OS
      • 2.1 Microkernel Based System Policy
      • 2.2 Component Communication
    • 3 Related Work
    • 4 Reconfigurable Hardware
    • 5 Reconfiguration Software
      • 5.1 Loading Partial Bitstreams
      • 5.2 Accessing the Configuration Port
      • 5.3 Hardware Scheduler
      • 5.4 Hardware Acceleration
    • 6 Exemplary Use Case and Evaluation
    • 7 Conclusion
    • References
  • Large Scale Computing
  • Do Iterative Solvers Benefit from Approximate Computing? An Evaluation Study Considering Orthogonal Approximation Methods
    • 1 Introduction
      • 1.1 Current Status
      • 1.2 Methodology of the Evaluation
      • 1.3 Main Findings
    • 2 Mathematical Background and Data Generation
    • 3 Approximation Computing Methods
      • 3.1 Relaxed Synchronization
      • 3.2 Sampling
      • 3.3 On the Data Type Level
      • 3.4 Input Data Approximation
    • 4 Experiments
      • 4.1 Evaluation Metrics
      • 4.2 Influence of Approximate Computing on the Data Type Level
      • 4.3 Analysis of Approximate Computing Loop Strategies
      • 4.4 Accuracy Degradation Caused by Relaxed Synchronization
      • 4.5 Input Approximation
      • 4.6 Putting Everything Together
      • 4.7 Discussion
    • 5 Conclusion and Future Directions
    • References
  • A Flexible FPGA-Based Inference Architecture for Pruned Deep Neural Networks
    • 1 Introduction and Motivation
    • 2 Related Work
    • 3 Concept
    • 4 Architecture
    • 5 Experimental Results
    • 6 Conclusions
    • References
  • Author Index
