Grid, Cluster and Cloud Computing

The purpose of this lecture is to provide an introduction to the theoretical aspects and basic applications of grid, cluster and cloud computing technologies, with a focus on cloud computing. We will study the theoretical foundations as well as new and emerging topics related to grid operating systems, cloud-based systems, cloud system architectures and services.

Teams: join the class with the following code: 0cv5g2j

Labs and other communication will also take place on Teams, in the same class.

27/03/2024 - 16:00-20:00 (C335) - IaaS, Private Cloud and Virtualization - Invited Speaker: Tudor Damian (Microsoft MVP and ethical hacker) - attendance is compulsory

20/03/2024 - 16:00-20:00 (C335) - Azure Architecture and Automation Seminar - Invited Speaker: Florin Loghiade, Cloud Solutions Architect - attendance is compulsory

The Microsoft Azure Platform and Cloud Applications - Invited Speaker: Radu Vunvulea

Azure overview, Windows Azure Compute, Windows Azure Storage

 

Content

1.     Introduction to cluster computing: definitions, roles, taxonomies

2.     Distributed processing

3.     Hardware, Architectures, Cluster Technologies

4.     Distributed File Systems

5.     Virtualization technologies

6.     Grid and Cloud Processing

7.     Grid and Cloud System Architectures

8.     Implementation methods for application partitioning and planning

9.     Functional and parallel programming

10. The MapReduce paradigm

11. Web Services and Computing Services

12. Microsoft Azure/Amazon AWS

13. Cloud-based data management systems

14. Final Overview

Bibliography:

1.      Foster, Ian; Carl Kesselman (1999). The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers. ISBN 1-55860-475-8

2.      Li, Maozhen; Mark A. Baker (2005). The Grid: Core Technologies. Wiley. ISBN 0-470-09417-6

3.      G. Reese, Cloud Application Architectures: Building Applications and Infrastructure in the Cloud, O'Reilly, 2009, ISBN:978-0-596-15636-7

4.      A.S. Tanenbaum, Operating Systems: Design and Implementation (Third Edition), Prentice Hall, 2006

5.      J.F. Kurose, K. Ross, Computer Networking: A Top-Down Approach, Addison-Wesley, 2007 (4th ed.)

6.      Anil Desai, The Definitive Guide to Virtual Platform Management, 2010, CA Technologies, download: http://nexus.realtimepublishers.com/dgvpm.php

7.      R. Jennings, Cloud Computing with the Windows Azure Platform (Wrox Programmer to Programmer),  Wrox, 2009, ISBN: 978-0470506387

8.      D. Sanderson, Programming Google App Engine: Build and Run Scalable Web Apps on Google's Infrastructure, O'Reilly, 2009, ISBN: 978-0-596-52272-8

9.      Andy Oram (ed.), Peer-to-Peer: Harnessing the Power of Disruptive Technologies, O'Reilly, 2001, ISBN: 978-0596001100

10.  * * *, http://code.google.com/intl/ro-RO/appengine/docs/

 

Labs/Seminars

1.      Sem 1, 2, 3 - The Hadoop infrastructure: installation, configuration, running applications

·         Virtual Machine installation and Configuration

·         Hadoop installation on the VM

·         Configure master nodes and compute nodes / DFS

·         Replication configuration and cluster management

·         Docker deployment

An example virtual machine can be downloaded from the Teams Files folder of the class. An online copy (for remote download) of the same virtual machine (user: root, pass: test) is also available here.

The Docker variant (which students should develop further) is available here: https://github.com/asergiu/hadoop-docker

Once this lab is finished you should have Eclipse (optionally) configured with the Hadoop plugin, together with the virtual machine, and be able to connect from Eclipse to the VM in order to deploy jobs, or just to build your apps and run them on the cluster. You should also be able to copy files to and from the HDFS filesystem.

2.      Sem 4 - Implementing a simple Hadoop application (the MapReduce "Hello World": word count)

3.      Sem 5, 6 - Implementing an Azure distributed app using at least 3 services (for example: Compute/Storage/Web Roles and Worker Roles, or something else). An in-class example is provided.

4.      Sem 7 - Semester project presentation - Hadoop - inverted index.

Hadoop Project: Using the already configured Hadoop environment from above (either the VM or the Docker version), together with the development environment, write an inverted index app for a set of books as data (you can download books from Project Gutenberg in plain text format). To build the inverted index you need to account for a stop word list: words that will not be indexed by the application (e.g. and, or, how, so). The application will read these stop words from a text file (stopwords.txt).

An inverted index contains, for each distinct word, a list of the files containing that word, together with its locations within each file (line numbers in our case). When running the application you should have a small cluster/cloud of at least two nodes built from VMs/Docker, or possibly a larger cluster built from all your individual VMs.

Ex: word: (file#1, line#1, line#2, …) (file#4, line#1, line#2, …) …
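The index structure described above can be prototyped without a cluster. The following is a minimal sketch in plain Python, not a Hadoop job: the function names (`load_stop_words`, `map_phase`, `reduce_phase`, `build_index`), the tokenizer, and the sample file names are assumptions made for illustration only. It mirrors the map step (emit a word with its file and line number, skipping stop words) and the reduce step (group the locations per word and per file).

```python
import re
from collections import defaultdict

# Assumption: a word is a run of lowercase letters or apostrophes.
TOKEN = re.compile(r"[a-z']+")


def load_stop_words(text):
    """Parse the contents of a stop word file (whitespace-separated words)."""
    return {word.lower() for word in text.split()}


def map_phase(file_name, lines, stop_words):
    """Emit (word, (file_name, line_number)) for every word that is indexed."""
    for line_number, line in enumerate(lines, start=1):
        for word in TOKEN.findall(line.lower()):
            if word not in stop_words:
                yield word, (file_name, line_number)


def reduce_phase(pairs):
    """Group locations by word, then by file, with sorted unique line numbers."""
    index = defaultdict(lambda: defaultdict(set))
    for word, (file_name, line_number) in pairs:
        index[word][file_name].add(line_number)
    return {
        word: {name: sorted(numbers) for name, numbers in sorted(files.items())}
        for word, files in sorted(index.items())
    }


def build_index(books, stop_words):
    """Run the map phase over every book, then reduce all emitted pairs."""
    pairs = []
    for file_name, lines in books.items():
        pairs.extend(map_phase(file_name, lines, stop_words))
    return reduce_phase(pairs)


if __name__ == "__main__":
    stop_words = load_stop_words("and or how so the")
    books = {
        "file1.txt": ["The whale and the sea", "So the whale dives"],
        "file4.txt": ["How the sea rises"],
    }
    for word, locations in build_index(books, stop_words).items():
        print(word, locations)
```

In a real Hadoop job the grouping done here by `reduce_phase` is split between the framework's shuffle step and the reducer; the sketch only shows the data that flows between the two phases.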

 

Bibliography and references – material (campus access only; a copy is available on Teams)

 

Example: WordCount (Java)
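The Java example itself is not reproduced on this page. As a language-neutral illustration of the same pattern, here is a minimal sketch in plain Python that simulates the three MapReduce steps in memory. The function names (`mapper`, `shuffle`, `reducer`, `word_count`) and the tokenizer are assumptions for illustration; this is not the Hadoop API.

```python
import re
from collections import defaultdict


def mapper(line):
    """Map step: emit (word, 1) for every word on the line."""
    for word in re.findall(r"[a-z']+", line.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce step: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Hello World", "hello Hadoop"]))
```

On a cluster the framework performs the shuffle and runs many mappers and reducers in parallel; the per-record logic stays as small as shown here.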

Azure Project 1 – Any project using at least 3 services from the Azure platform – you can come up with a problem yourself.

Example of requirements for a project (you do not need to implement this one): Implement a cloud web application that accepts guest reviews (text) with images. As users post reviews with images on the web page, the posts are displayed in reverse chronological order (the most recent at the top of the page). The app should keep all posts and display them at all times, while allowing guests to publish new posts. Because guests may upload images of widely varying, possibly large sizes, the guestbook app should resize images automatically, so that the image from each post is scaled down to a standard small size (around 128 pixels) and shown as an icon. In each post the original image is replaced by its small icon, and the original remains reachable by clicking the icon. This ensures that the guestbook page does not take too long to download and display in users' browsers because of large images. You will probably need: a table store for messages and links, blob storage for images and thumbnails (icons), a web role implementing the web page, and a worker role that reads from a queue storage whose entries are the large images to be scaled down. As the worker role progresses through the posts, their images are turned into thumbnails and the web page is updated to reflect that.
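The division of work between the web role and the worker role can be sketched as follows. This is a minimal in-memory simulation in plain Python, not Azure code: the `Guestbook` class, its method names, and the `thumb-` naming scheme are hypothetical, the table store, blob storage and queue storage are replaced by a list, a dict and a `queue.Queue`, and an image is represented only by its (width, height) so that no image library is needed.

```python
from dataclasses import dataclass
from queue import Queue

THUMBNAIL_MAX = 128  # assumed target for the longest side, in pixels


def thumbnail_size(width, height, max_side=THUMBNAIL_MAX):
    """Scale (width, height) down so the longest side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are never scaled up
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


@dataclass
class Post:
    text: str
    image_blob: str           # name of the original image in blob storage
    thumbnail_blob: str = ""  # set by the worker once the icon exists


class Guestbook:
    def __init__(self):
        self.posts = []      # stand-in for the table store
        self.blobs = {}      # stand-in for blob storage: name -> (width, height)
        self.work = Queue()  # stand-in for queue storage

    def submit(self, text, image_name, size):
        """Web role: store the image and the post, then enqueue resize work."""
        self.blobs[image_name] = size
        self.posts.append(Post(text, image_name))
        self.work.put(len(self.posts) - 1)

    def process_queue(self):
        """Worker role: create a thumbnail for every queued post."""
        while not self.work.empty():
            post = self.posts[self.work.get()]
            thumb_name = "thumb-" + post.image_blob
            self.blobs[thumb_name] = thumbnail_size(*self.blobs[post.image_blob])
            post.thumbnail_blob = thumb_name

    def page(self):
        """Posts in reverse chronological order (most recent first)."""
        return list(reversed(self.posts))


if __name__ == "__main__":
    book = Guestbook()
    book.submit("Great stay", "a.png", (1024, 768))
    book.process_queue()
    print(book.page()[0], book.blobs)
```

The queue decouples the two roles: the web page can accept and show a post immediately, and the thumbnail appears once the worker has processed the corresponding queue entry.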

 

Presentation Subjects

Examination

Written Exam (or Presentation) 50% + Hadoop Project 25% + Azure Project 25% = Final Grade

Students need:

·        A minimum grade of 5 on the presentation topic or the written exam (you choose one of the two)

·        A minimum grade of 5 for the projects
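As a worked example of the grading formula above, here is a minimal Python sketch. The function names are hypothetical, and the reading that the minimum of 5 applies to each project separately is an assumption; the page does not say whether it applies per project or to the projects taken together.

```python
def final_grade(exam, hadoop, azure):
    """Weighted grade: exam (or presentation) 50%, each project 25%."""
    return 0.5 * exam + 0.25 * hadoop + 0.25 * azure


def meets_minimums(exam, hadoop, azure, minimum=5):
    """Assumption: the minimum applies to the exam and to each project."""
    return exam >= minimum and hadoop >= minimum and azure >= minimum


if __name__ == "__main__":
    print(final_grade(8, 6, 10), meets_minimums(8, 6, 10))
```

For instance, an exam grade of 8 with project grades of 6 and 10 gives 0.5 × 8 + 0.25 × 6 + 0.25 × 10 = 8.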