PDC Example Exam Questions


Parallel and Distributed Computing - Questions & Answers

(example exam questions, correct answers are underlined)

I. General questions:

1. Please select the phrases characterizing parallel computing:


a. multiple processors or computers working together on a common task,
b. each processor works on its section of the problem,
c. processors can exchange information,
d. allows solving problems that don't fit into the memory of a single computer,
e. can solve problems faster than on a single computer.

2. Please select the phrases characterizing distributed computing:


a. multiple processors or computers can work on separate tasks,
b. distributed algorithms often require the use of a coordinating process,
c. processors can exchange information,
d. distributed computing can be combined with parallel computing,
e. distributed computing can offer data and computation redundancy to increase reliability of
computation.

3. What are the benefits of rigid, symmetric computing architectures (like mesh, hypercube,
ring)?
a. existence of optimal distributed and parallel algorithms for different problems,
b. the crystal-like organization of the connections between computing nodes allows limiting data transmission time,
c. the crystal-like organization of the connections between computing nodes allows precise estimation of computation time for any given problem,
d. such architectures make distributed programs tolerant of hardware errors.

4. What are the benefits of solving problems with the use of artificial neural networks?
a. one can easily understand the reasoning behind the results given by the neural model,
b. it is possible to automatically prepare solutions for many problems with one training
algorithm,
c. there is a guarantee that results will be of the same precision as when using a proper classical algorithm,
d. failure of some of the connections or computing nodes (neurons) can have a small impact on the accuracy of results.

5. Which of these problems can be solved in a distributed computing system with high speed-up?
a. summation of many numbers,
b. analysis of many graphical files,
c. reading many numbers from sequential access storage device,
d. writing many numbers to parallel access storage device,
e. performing ray-tracing algorithm for given scene.
6. Which of the mentioned systems does not perform distributed and parallel processing of
information:
a. Prometheus supercomputer at AGH Cyfronet in Cracow,
b. human brain,
c. cells of human arm,
d. smart sensors of IoT system,
e. graphical multi-processor (GPU),
f. 5-qubit quantum computer,
g. galaxy,
h. magnifying glass.

7. Select the characteristics of computation related to data parallelism


a. the same operations are performed on different subsets of the same data,
b. synchronous computation,
c. speedup is greater, as there is only one execution thread operating on all sets of data,
d. amount of parallelization is proportional to the input data size,
e. can offer optimum load balance on a multiprocessor system.

8. Select the characteristics of computation related to task parallelism


a. different operations are performed on the same or different data,
b. asynchronous computation,
c. speedup is lower, as each processor will execute a different thread or process on the same or different set of data,
d. amount of parallelization is proportional to the number of independent tasks to be
performed,
e. load balancing depends on the availability of the hardware and scheduling algorithms like
static and dynamic scheduling.

II. Questions related to HPC and parallel & distributed programming:

9. What are the most important factors determining the speed-up of distributed execution of a parallel algorithm?
a. computation time at the computing nodes,
b. communications time between computing nodes,
c. number of computing nodes,
d. energy cost of computation.
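
A short worked note for question 9, assuming the usual decomposition of parallel run time into computation and communication components:

    S(p) = \frac{T(1)}{T(p)}, \qquad T(p) \approx T_{comp}(p) + T_{comm}(p)

so the speed-up on p computing nodes grows only as long as the per-node computation time shrinks faster than the communication time grows.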

10. What are the most important factors determining possible time of continuous
supercomputer operation?
a. cost of energy,
b. mean time between failures (MTBF),
c. number of elements of the supercomputer,
d. patience of the operator,
e. level of redundancy of the mass storage devices.

11. For a 2D mesh architecture with n² computing nodes, the speedup of computation in relation to a single node:
a. equals n,
b. is lower than or equal to n²,
c. depends on the problem and processing algorithm,
d. is between 2 and n.
12. How many direct neighbors does each computing node have in a 4D hypercube architecture?
a. 8,
b. 16,
c. 4,
d. it cannot be determined without knowing the total number of nodes.

13. How many computing nodes does a 5D hypercube architecture have?


a. 128,
b. 8,
c. 32,
d. 16.
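
A small worked note for questions 12 and 13: a d-dimensional hypercube has 2^d computing nodes and every node has exactly d direct neighbors (one per dimension),

    N(d) = 2^{d}, \qquad \deg(v) = d \quad \text{for every node } v,

so, for example, d = 4 gives 16 nodes with 4 neighbors each, and d = 5 gives 2^5 = 32 nodes.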

14. What are the benefits of using frameworks such as MPI?


a. they create a virtualization layer that allows constructing different logical distributed architectures of computing nodes on any given physical network of computers,
b. they offer a set of communication functions usable for parallel programming,
c. they include a set of functions solving typical computational problems in a distributed manner, like distributed matrix multiplication, distributed sorting with the Fox algorithm, etc.,
d. they offer a set of synchronization mechanisms usable for parallel programming.
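
A minimal sketch (hypothetical example, not part of the exam) of the point-to-point communication functions such frameworks offer, here with MPI: rank 0 sends one integer to rank 1; run with at least two processes (e.g. mpirun -np 2).

    #include <mpi.h>
    #include <cstdio>

    int main(int argc, char **argv) {
        int rank, value = 42;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);                     // which process am I?
        if (rank == 0) {
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   // to rank 1, tag 0
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            std::printf("rank 1 received %d\n", value);
        }
        MPI_Finalize();
        return 0;
    }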

15. Consider the following program and select correct statements below:
1. e=a+b
2. f=c+d
3. m=e*f
a. operation 3 depends upon results of operations 1 and 2, so it cannot be calculated until
both of them are completed,
b. operations 1 and 2 do not depend on any other operation, so they can be calculated
simultaneously,
c. operations 1 and 2 depend on each other, so they cannot be calculated simultaneously,
d. if we assume that each operation can be completed in one unit of time, then these three instructions can be completed in a total of two units of time,
e. if we assume that each operation can be completed in one unit of time, then these three instructions can be completed in a total of one unit of time.
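
A minimal C++ sketch of the dependencies in question 15, with hypothetical input values: operations 1 and 2 are mutually independent and may run concurrently, while operation 3 must wait for both.

    #include <thread>
    #include <cstdio>

    int main() {
        int a = 1, b = 2, c = 3, d = 4, e = 0, f = 0;
        std::thread t1([&] { e = a + b; });   // operation 1
        std::thread t2([&] { f = c + d; });   // operation 2
        t1.join();                            // operation 3 has to wait
        t2.join();                            // for both 1 and 2
        int m = e * f;                        // operation 3
        std::printf("m = %d\n", m);
        return 0;
    }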

16. During parallelization the main goal of loop transformation is to:


a. maximize the degree of parallelism in a loop nest,
b. maximize data locality in a loop nest,
c. support efficient usage of the memory hierarchy on a parallel machine,
d. increase task parallelism.

17. The typical loop transformations during code parallelization are:


a. permutation,
b. deletion,
c. reversal,
d. skewing,
e. grinding,
f. unrolling,
g. unswitching.
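
A small sketch of two of these transformations on a hypothetical array-copy loop nest: loop permutation (interchange) gives the inner loop stride-1 accesses and better data locality, and unrolling the inner loop reduces loop overhead.

    #define N 1024
    float a[N][N], b[N][N];

    // Before: the inner loop walks down a column of row-major arrays,
    // so consecutive iterations touch memory N floats apart (poor locality).
    void copy_before(void) {
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                a[i][j] = b[i][j];
    }

    // After permutation (interchange) the inner loop is stride-1;
    // it is then unrolled by 4 (N is divisible by 4).
    void copy_after(void) {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j += 4) {
                a[i][j]     = b[i][j];
                a[i][j + 1] = b[i][j + 1];
                a[i][j + 2] = b[i][j + 2];
                a[i][j + 3] = b[i][j + 3];
            }
    }
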
18. Select languages/compilers that support automatic code parallelization:
a. Java 8+,
b. Haskell,
c. Polaris Fortran compiler,
d. C#,
e. Intel C++,
f. Rubus,
g. VPascal.

19. Select constructs that support code parallelization:


a. parallel streams,
b. immutable objects,
c. completable futures,
d. monads,
e. transactional memory,
f. mutexes,
g. data races.

20. Select programming tools that support code parallelization:


a. dedicated keywords (“volatile”, “synchronized”),
b. detection of data parallelism,
c. atomic operations,
d. asynchronous exceptions,
e. threads and threads pools,
f. semaphores,
g. deadlocks.

III. Questions related to GPU and CUDA:

21. What is the factor that most limits the efficiency of GPGPU processors?


a. number of internal computing cores,
b. speed of CPU-GPU memory transfers,
c. energy consumption of computing cores,
d. amount of memory available to GPU cores.

22. Please select the phrases characterizing GPU:


a. streaming multiprocessor (SM) includes many computing cores,
b. each computing core in SM executes identical instruction set, or sleeps,
c. computing cores in SM can use shared memory,
d. a kernel scales across any number of parallel processors under assumption that each block
can be assigned to any processor at any time,
e. computing cores can access GPU global memory,
f. host CPU can access global memory of GPU.

23. Please select the limitations of GPU (device):


a. no recursion in device code,
b. no function pointers in device code,
c. memory latency is not hidden by large cache,
d. GPU needs many active threads to hide memory latency,
e. only host CPU can modify constant memory of the blocks’ grid.
24. Please select the phrases characterizing memory access in CUDA:
a. writing to a variable in shared memory by multiple threads needs synchronization of all threads in a block,
b. atomic operations on global memory (e.g. atomicAdd()) are not cheap due to the need for serialized write access to a memory cell,
c. atomic operations are much faster for shared variables than for global ones,
d. non-coalesced writes to GPU global memory are more than 10 times slower than coalesced ones.

25. Please select correct statements related to CUDA Cooperative Thread Array
a. CUDA thread block (CTA) is an array of concurrent threads that cooperate to compute a
result,
b. CTA threads have thread id numbers,
c. CTA threads share data and synchronize,
d. thread program within CTA uses thread id to select work and address shared data,
e. CTA size and dimensionality is declared by the programmer.
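
A minimal CUDA kernel sketch tying together questions 24 and 25 (hypothetical names; assumes the input array already resides in GPU global memory and a GPU with floating-point atomics): each thread uses its id to select its element, threads of a block share and synchronize on shared memory, and the cheap block-local atomics are combined into a single expensive global atomicAdd() per block.

    __global__ void block_sum(const float *in, float *global_sum, int n) {
        __shared__ float local_sum;                      // shared by the whole block (CTA)
        if (threadIdx.x == 0) local_sum = 0.0f;
        __syncthreads();

        int i = blockIdx.x * blockDim.x + threadIdx.x;   // thread id selects the work
        if (i < n)
            atomicAdd(&local_sum, in[i]);                // atomic on shared memory: fast
        __syncthreads();

        if (threadIdx.x == 0)
            atomicAdd(global_sum, local_sum);            // atomic on global memory: once per block
    }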

26. Please select statements describing how a CUDA programmer partitions a problem with Data-Parallel decomposition:
a. programmer partitions problem into Grids, one Grid per sequential problem step,
b. programmer partitions Grid into result Blocks computed independently in parallel,
c. programmer assumes that GPU thread array computes result Block,
d. programmer partitions Block into elements computed cooperatively in parallel,
e. programmer assumes that GPU thread computes result elements.

27. Is it true that within an integrated CPU+GPU application C program:


a. problem is partitioned into a sequence of kernels,
b. kernel C code executes on GPU,
c. serial C code executes on CPU,
d. kernels execute as blocks of parallel threads,
e. the program can use application data parallelism and thread parallelism.
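
A minimal host-side sketch of this pattern with a hypothetical 'scale' kernel (compile with nvcc): serial C code executes on the CPU, data is copied to the device, and the kernel executes on the GPU as a grid of blocks of parallel threads.

    #include <cstdlib>

    __global__ void scale(float *x, float s, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= s;
    }

    int main(void) {
        const int n = 1 << 20;
        float *h = (float *)std::malloc(n * sizeof(float));   // host (CPU) memory
        for (int i = 0; i < n; i++) h[i] = 1.0f;               // serial code on CPU

        float *d;
        cudaMalloc(&d, n * sizeof(float));                     // device (GPU) memory
        cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

        scale<<<(n + 255) / 256, 256>>>(d, 2.0f, n);           // kernel: blocks of threads
        cudaDeviceSynchronize();

        cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
        cudaFree(d);
        std::free(h);
        return 0;
    }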

28. Please select the phrases characterizing CUDA:


a. automatic variables without any type qualifier reside in registers if they are available,
b. arrays reside in local memory,
c. pointers can only point to global memory,
d. pointer chains (linked lists) should not be used,
e. no communication among Blocks of the same Grid,
f. the program does not know how many cores it uses.

29. Is it true that within GPU:


a. thread block is a set of threads running on one multiprocessor,
b. all threads of a single thread block can communicate with each other through shared
memory,
c. threads of a single thread block can be synchronized,
d. blocks of threads are divided into warps,
e. warps are groups of threads that are scheduled together for execution,
f. a small number of threads per block causes load latency in device memory reads,
g. assigning one block per multiprocessor makes the multiprocessor idle during thread
synchronization.
30. Typical GPU operation within Streaming Multiprocessor can be expressed as:
a. SISD – Single Instruction Single Data,
b. MISD – Multiple Instructions Single Data,
c. SIMD – Single Instruction Multiple Data,
d. MIMD – Multiple Instructions Multiple Data,
e. SPMD – Single Program Multiple Data,
f. SIMT – Single Instruction Stream Multiple Threads

31. Typical GPU operation can be expressed as:


a. SISD – Single Instruction Single Data,
b. MISD – Multiple Instructions Single Data,
c. SIMD – Single Instruction Multiple Data,
d. MIMD – Multiple Instructions Multiple Data,
e. SPMD – Single Program Multiple Data

IV. Questions related to IoT:

32. What are the places/applications in which one can expect most useful usage of IoT systems?
a. factories,
b. nursing homes,
c. smart cities,
d. house kitchen,
e. mass surveillance systems,
f. environment monitoring.

33. What are the factors that strongly limit the operation time of battery-operated IoT sensors?
a. average energy consumption,
b. frequency of electromagnetic waves used for radio transmissions,
c. frequency of radio transmissions,
d. mass of the sensor,
e. energy capacity of the battery,
f. number of IoT sensors in close proximity.

34. What are the actual problems with end-consumer IoT systems?
a. short lifetime due to limited product support,
b. short battery-life,
c. complicated configuration,
d. limited standardization of protocols (war of standards between companies),
e. forced high dependency of the local IoT system on the manufacturer's cloud system,
f. short and limited tests of the product by the manufacturer,
g. lack of standards and very limited guarantees related to data and consumer health safety.

V. Questions related to artificial neural networks:

35. Feedforward neural networks are:


a. built of simple computing units called neurons,
b. universal approximators,
c. processing signals from inputs to outputs,
d. including one or many fully connected layers of neurons,
e. called black-boxes.
36. Error backpropagation algorithm:
a. can be used to train artificial neural networks,
b. is susceptible to the exploding gradient problem,
c. is susceptible to the vanishing gradient problem (death of the neurons),
d. is unstable (results highly depend on the initial values of weights),
e. is local in nature (tends to explore the nearest local optimum).

37. Stochastic Gradient Descent means:


a. update of connection weights after presentation of all training vectors,
b. update of connection weights after presentation of each training vector,
c. update of connection weights after presentation of a few training vectors,
d. that training is impossible due to random nature of training data.

38. Mini-batch training of neural networks:


a. helps to speed up training with the use of GPUs,
b. updates connection weights after presentation of a few training vectors,
c. smooths the changes of the output errors of the NN during training,
d. increases the number of connection-weight updates per given number of processed training vectors.
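
A compact way to state the difference probed by questions 37 and 38, assuming standard notation (learning rate \eta, loss L, training set of N vectors): all three variants use the same weight-update rule and differ only in the batch B over which the gradient is averaged,

    w \leftarrow w - \eta \cdot \frac{1}{|B|} \sum_{i \in B} \nabla_w L(x_i, y_i; w),

with |B| = N for full-batch training, |B| = 1 for stochastic gradient descent, and 1 < |B| < N for mini-batch training.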

39. Typical simulation and training of NNs on GPUs makes use of:


a. matrix representation of NNs,
b. matrix related operations (e.g. multiplication, convolution, max pooling),
c. loading as much training data to GPU memory as possible,
d. NN models with only a few (or no) fully connected layers of neurons,
e. linear rectifier (ReLU) activation functions to speed up training and limit death of neurons,
f. the CPU to help calculate values of activation functions.
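
The matrix representation mentioned in question 39, in the usual (assumed) notation: a fully connected layer maps an input vector x to \varphi(W x + b), and a mini-batch of inputs stacked into a matrix X is processed with a single matrix-matrix multiplication,

    Y = \varphi(W X + B),

which is exactly the kind of operation GPUs execute efficiently; the linear rectifier \varphi(z) = \max(0, z) is cheap to evaluate and helps limit the death of neurons.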

40. Typical practical applications of deep learning of Convolutional NNs with the use of GPUs are:
a. speech recognition,
b. scene understanding,
c. autonomous vehicles,
d. image and video up-scaling,
e. winning the game of Go against a human master,
f. testing GPU cores and memory for errors.

41. Distributed and parallel training of artificial neural networks can be done with usage of:
a. map-reduce version of error-backpropagation algorithm (as e.g. in h2o framework),
b. parameters selection by evolutionary algorithm (e.g. genetic algorithm),
c. mini-batch training,
d. matrix operations on massively parallel multiprocessors inside GPUs,
e. many computers connected to the Internet,
f. parallel port Covox DAC device.

42. Problems and limitations of artificial neural networks:


a. long training times for deep architectures with very large numbers of neuron layers,
b. hard to verify if neural model behaves correctly for all input vectors,
c. selection of model architecture based on experience and experiments,
d. susceptible to adversarial attacks with usage of noisy data,
e. can generate false data that look trustworthy,
f. quickly degenerate with the number of processed data vectors.

VI. Questions related to ensembles of classifiers:

43. Ensembles of classifiers are:


a. created from many base classifiers,
b. using voting to increase probability of correct decision,
c. better when errors made by base classifiers are mutually independent,
d. easier to construct for problems with a greater number of considered classes,
e. better when errors made by base classifiers are mutually dependent.
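
A short worked note for question 43, under the common assumption of n independent base classifiers (odd n, so there are no ties), each correct with probability p > 1/2: the probability that a majority vote is correct is

    P_maj = \sum_{k = \lfloor n/2 \rfloor + 1}^{n} \binom{n}{k} p^{k} (1 - p)^{n - k},

which grows towards 1 as n increases (Condorcet's jury theorem); when the errors of the base classifiers are strongly dependent, the votes carry redundant information and this gain largely disappears.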

44. Ensembles of classifiers are:


a. easy to parallelize due to the distributed nature of the set of base classifiers and of the processed data,
b. made of base classifiers of single or many types (hybrid, e.g. neural networks and decision trees),
c. models of groups of humans voting in large decisive bodies like a senate,
d. hard to parallelize due to the centralized nature of voting,
e. using voting to decrease mutual dependence between base classifiers.

VII. Questions related to peer-to-peer networks:

45. Peer-to-peer can be defined as:


a. theory behind data sharing protocols,
b. a large-scale distributed system that operates without a central server bottleneck,
c. a set of protocols used in systems such as Gnutella, YouTube and BitTorrent,
d. a connection between two computers on the Internet.

46. Select peer-to-peer architectures:


a. client-server (with single central server, e.g. Napster),
b. unstructured (without single entity having global map of the system, e.g. Gnutella),
c. structured (e.g. hypercubes, fat trees, rings, grids, tori; e.g. Kademlia & eMule)
d. hybrid of structured and unstructured (reliable/powerful peers are promoted to
superpeers/trackers; superpeers create a layer of servers handling search requests, peers are unstructured, e.g. YouTube, Google search),
e. butterfly, CCC (cube-connected cycles).

47. Typical problems existing in peer-to-peer network:


a. high level of churn (peers are randomly joining and leaving the network),
b. discovery of peers (especially when peer joins the network),
c. organizing distributed hash table,
d. providing search efficiency and security (indexing, replication of crucial data),
e. preventing denial of service attacks (obligatory message delays),
f. keeping privacy of peers,
g. possibility of high transfer rates in fat branches of structured networks,
h. processing non-exact queries.

48. Please select examples of peer-to-peer overlay networks:


a. Domain Name System (DNS),
b. Network Time Protocol (NTP),
c. BitTorrent network,
d. Blockchain network,
e. Tor onion network,
f. Skype for Business.
49. Blockchain system is realization of:
a. one openly available single ledger for all parties, produced, replicated and authenticated
collaboratively,
b. an openly available separate ledger for each of the parties, produced, replicated and authenticated collaboratively,
c. an openly available separate ledger for each of the parties, each produced and authenticated by the given party,
d. a separate, secret ledger for each of the parties, each produced by the given party.

50. Please select main properties and costs of blockchain systems:


a. transaction transparency for the cost of high system complexity,
b. anonymity for the cost of low compliance with legal systems,
c. trust for the price of high environmental cost,
d. low cost of transaction for the cost of low speed of transaction,
e. freedom for the cost of low interoperability of blockchain systems,
f. high market value for the cost of low scalability.
