Operating Systems for Parallel and Distributed Architectures / Operating Systems and Computer Architecture

Information regarding the discipline

Name of the discipline:
Operating Systems for Parallel and Distributed Architectures (High Performance Computing and Big Data Analytics Master’s programme)
Operating Systems and Computer Architecture (Artificial Intelligence for Connected Industries Master’s programme)

Course coordinator: Assoc. Prof. Darius Bufnea, darius.bufnea at ubbcluj dot ro

Prerequisites

Curriculum: Operating Systems, Distributed Operating Systems, Computer Networks
Competencies: Average administration and programming skills

Objectives of the discipline

General objective of the discipline: Know the key concepts of parallel cluster architectures
Specific objective of the discipline: At the end of the course, students will know how to build, deploy, configure, maintain, monitor, and debug a Linux parallel cluster.

Content

  1. Introduction to operating systems for parallel architectures
  2. Parallel cluster architecture: Cluster Head Nodes, Compute Nodes, Clustering Middleware
  3. Parallel Cluster Paradigms: Single system image, Centralized system management, High processing capacity, Resource consolidation, Optimal use of resources, High-availability, Redundancy, Single points of failure, Failover protection and disaster recovery, Horizontal and vertical scalability, Load-balancing, Elasticity, Run jobs anytime, anywhere
  4. Design and configuration. Network prerequisites for a parallel cluster: LAN, bandwidth, latency, interface, security aspects. Automatic node configuration and deployment
  5. Virtualization of hardware, operating system, storage devices, computer network resources
  6. Beowulf cluster deployment and administration
  7. Linux Cluster Distributions: Mosix, ClusterKnoppix. Automated operating system and software provisioning for a Linux Cluster: Open Source Cluster Application Resources (OSCAR)
  8. Cluster resources: distributed memory architecture and distributed shared memory, distributed file systems (examples: IBM General Parallel File System, Microsoft’s Cluster Shared Volumes, Oracle Cluster File System)
  9. Nodes and head node management, Cluster system management, Debugging and monitoring a parallel cluster, Node failure management
  10. Data sharing and communication, Message passing and communication, Parallel processing libraries: Parallel Virtual Machine toolkit and the Message Passing Interface library
  11. Software and development environment, Parallel application development and execution (Parallel Environment – PE), Job scheduling & management

Bibliography

  1. Gregory Pfister: In Search of Clusters, Prentice Hall; 2nd edition (December 22, 1997), ISBN-10: 0138997098, ISBN-13: 978-0138997090;
  2. George F. Coulouris, Jean Dollimore, Tim Kindberg: Distributed Systems: Concepts and Design, Addison-Wesley; 5th edition (May 7, 2011), ISBN-10: 0132143011, ISBN-13: 978-0132143011;
  3. Joseph D. Sloan: High Performance Linux Clusters with OSCAR, Rocks, OpenMosix, and MPI, O’Reilly Media (November 23, 2004), ISBN-10: 0596005709, ISBN-13: 978-0596005702;
  4. Daniel F. Savarese, Donald J. Becker, John Salmon, Thomas Sterling: How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters, The MIT Press (May 28, 1999), ISBN-10: 026269218X, ISBN-13: 978-0262692182;
  5. Gordon Bell, Thomas Sterling: Beowulf Cluster Computing with Linux, The MIT Press; 1st edition (October 1, 2001), ISBN-10: 0262692740, ISBN-13: 978-0262692748;
  6. Charles Bookman: Linux Clustering: Building and Maintaining Linux Clusters, Sams Publishing; 1st edition (June 29, 2002), ISBN-10: 1578702747, ISBN-13: 978-1578702749.

Evaluation

Type of activity       | Evaluation criteria                                                     | Evaluation methods                        | Share in the grade (%)
Course                 | Know the key theoretical concepts of parallel cluster architectures    | Written exam                              | 30%
Seminar/lab activities | Know how to deploy, maintain, debug and monitor a parallel cluster      | Homework assignments                      | 30%
                       |                                                                         | Presentation on clustering related topics | 30%
Default                |                                                                         |                                           | 10%