SF – SIGCSE 2022 Version

Systems Fundamentals (SF)

A computer system is a set of hardware and software infrastructure upon which applications are constructed. Computer systems have become a pillar of people’s daily life. As such, learning the knowledge about computer systems, grasping the skills to use and design these systems, and understanding the fundamental rationale and principles in computer systems are essential to equip students with the necessary competency toward a career related to computer science.

In the curriculum of computer science, the study of computer systems typically spans across multiple courses, including, but not limited to, operating systems, parallel and distributed systems, communications networks, computer architecture and organization and software engineering. The System Fundamentals knowledge area, as suggested by its name, focuses on the fundamental concepts in computer systems that are shared by these courses within their respective cores. The goal of this knowledge area is to present an integrative view of these fundamental concepts in a unified albeit simplified fashion, providing a common foundation for the different specialized mechanisms and policies appropriate to the particular domain area. These concepts include an overview of computer systems, basic concepts such as state and state transition, parallelism, resource allocation and scheduling, and so on. Compared to CS2013, the SF knowledge area makes the following major changes to the knowledge units:

Added two new units: system security and system design;
Added a new unit of system performance, which includes the topics from the deprecated unit of proximity and the deprecated unit of virtualization and isolation;
Added a new unit of performance evaluation, which includes the topics from the deprecated unit of evaluation and the deprecated unit of quantitative evaluation;
Changed the unit of computational paradigms to overview of computer systems, deprecated some topics in the unit, and added topics from the deprecated unit of cross-layer communications;
Changed the unit of state and state transition to basic concepts, and added topics such as finite state machines;
Changed some topics in the unit of parallelism, such as simple application-level parallel processing;
Deprecated the unit of cross-layer communications, and moved parts of its topics to the unit of overview of computer systems;
Deprecated the units of evaluation and quantitative evaluation, and moved parts of their topics to the unit of performance evaluation;
Deprecated the units of proximity and virtualization and isolation, and moved parts of their topics to the unit of system performance;
Renamed the unit of reliability through redundancy to system reliability.

SF. Systems Fundamentals. [18 Core-Tier1 hours, 9 Core-Tier2 hours]

	Core-Tier1 hours	Core-Tier2 hours	Includes Electives
SF/Overview of Computer Systems	3		N
SF/Basic Concepts	4		N
SF/Parallelism	3		N
SF/Resource Allocation and Scheduling	1	2	N
SF/System Performance	2	2	N
SF/Performance Evaluation	2	2	N
SF/System Reliability	1	1	N
SF/System Security	1	1	Y
SF/System Design	1	1	Y

SF/Overview of Computer Systems

[3 Core-Tier1 hours]

Topics:

Basic building blocks and components of a computer (gates, flip-flops, registers, interconnections; datapath + control + memory)
Hardware as a computational paradigm: Fundamental logic building blocks; Logic expressions, minimization, sum of product forms
Programming abstractions, interfaces, use of libraries
Distinction between application and OS services, remote procedure call
Application-OS interaction
Basic concept of pipelining, overlapped processing stages
Basic concept of scaling: going faster vs. handling larger problems

Learning Outcomes:

Describe the basic building blocks of computers and their role in the historical development of computer architecture.
Design a simple logic circuit using the fundamental building blocks of logic design to solve a simple problem (e.g., adder).
Use tools for capture, synthesis, and simulation to evaluate a logic circuit design.
Describe how computing systems are constructed of layers upon layers, based on separation of concerns, with well-defined interfaces, hiding details of low layers from the higher layers.
Describe that hardware, OS, VM, application are additional layers of interpretation/processing.
Describe the mechanisms of how errors are detected, signaled back, and handled through the layers.
Construct a simple program (e.g., a TCP client/server) using methods of layering, error detection and recovery, and reflection of error status across layers.
Find bugs in a layered program by using tools for program tracing, single stepping, and debugging.
Understand the concept of strong vs. weak scaling, i.e., how performance is affected by scale of problem vs. scale of resources to solve the problem. This can be motivated by the simple, real-world examples.

SF/Basic Concepts

[4 Core-Tier1 hours]

Topics:

Digital vs. Analog/Discrete vs. Continuous Systems
Simple logic gates, logical expressions, Boolean logic simplification
Clocks, State, Sequencing
State and state transition (e.g., starting state, final state, life cycle of states)
Finite state machines (e.g., NFA, DFA)
Combinational Logic, Sequential Logic, Registers, Memories
Computers and Network Protocols as examples of State Machines

Learning Outcomes:

Describe the differences between digital and analog systems, and between discrete and continuous systems. Can give real-world examples of these systems.
Describe computations as a system characterized by a known set of configurations with transitions from one unique configuration (state) to another (state).
Describe the distinction between systems whose output is only a function of their input (stateless) and those with memory/history (stateful).
Develop state machine descriptions for simple problem statement solutions (e.g., traffic light sequencing, pattern recognizers).
Describe a computer as a state machine that interprets machine instructions.
Explain how a program or network protocol can also be expressed as a state machine, and that alternative representations for the same computation can exist.
Derive time-series behavior of a state machine from its state machine representation (e.g., TCP connection management state machine).

SF/Parallelism

[3 Core-Tier1 hours]

[Cross-reference: PD/Parallelism Fundamentals]

Topics:

Sequential vs. parallel processing
Application-level sequential processing: single thread
Simple application-level parallel processing: request level (web services/client-server/distributed), single thread per server, multiple threads with multiple servers, pipelining
Parallel programming vs. concurrent programming
Request parallelism vs. Task parallelism
Multicore architectures and hardware support for synchronization

Learning Outcomes:

Write a simple sequential problem and a simple parallel version of the same program.
Evaluate the performance of simple sequential and parallel versions of a program with different problem sizes, and be able to describe the speed-ups achieved.
Demonstrate on an execution timeline that parallelism events and operations can take place simultaneously (i.e., at the same time). Explain how work can be performed in less elapsed time if this can be exploited.
Explain other uses of parallelism, such as for reliability/redundancy of execution.
Describe the differences between the concepts of instruction parallelism, data parallelism, thread parallelism/multitasking, task/request parallelism.
Write one simple parallel program (e.g., PageRank) in more than one parallel programming paradigm (e.g., a simple parallel program that manages shared resources through synchronization primitives; and a simple parallel program that performs simultaneous operation on partitioned data through task parallel).
Use performance tools to measure speed-up achieved by parallel programs in terms of both problem size and number of resources.

SF/Resource Allocation and Scheduling

[1 Core-Tier1 hour and 2 Core-Tier2 hours]

Topics:

Different types of resources (e.g., processor share, memory, disk, net bandwidth)
Common scheduling algorithms (e.g., first-come-first-serve scheduling, priority-based scheduling, fair scheduling and preemptive scheduling )
Advantages and disadvantages of common scheduling algorithms

Learning Outcomes:

Define how finite computer resources (e.g., processor share, memory, storage and network bandwidth) are managed by their careful allocation to existing entities.
Describe how common scheduling algorithms work.
Describe the pros and cons of common scheduling algorithms
Implement common scheduling algorithms, and evaluate their performances.

SF/System Performance

[2 Core-Tier1 hours and 2 Core-Tier2 hours]

[Cross-reference: AR/Memory Management, OS/Virtual Memory]

Topics:

Latencies in computer systems
- Speed of light and computers (one foot per nanosecond vs. one GHz clocks)
- Memory vs. disk latencies vs. across the network memory
Caches and the effects of spatial and temporal locality on performance in processors and systems
Caches and cache coherency in databases, operating systems, distributed systems, and computer architecture
Introduction into the processor memory hierarchy and the formula for average memory access time
Rationale of virtualization and isolation: protection and predictable performance
Levels of indirection, illustrated by virtual memory for managing physical memory resources
Methods for implementing virtual memory and virtual machines

Learning Outcomes:

Explain the importance of locality in determining system performance.
Calculate average memory access time and describe the tradeoffs in memory hierarchy performance in terms of capacity, miss/hit rate, and access time.
Explain why it is important to isolate and protect the execution of individual programs and environments that share common underlying resources.
Describe how the concept of indirection can create the illusion of a dedicated machine and its resources even when physically shared among multiple programs and environments.
Measure the performance of two application instances running on separate virtual machines, and determine the effect of performance isolation.

SF/Performance Evaluation

[2 Core-Tier1 hours and 2 Core-Tier2 hours]

Topics:

Performance figures of merit
Workloads and representative benchmarks, and methods of collecting and analyzing performance figures of merit
CPI (Cycles per Instruction) equation as tool for understanding tradeoffs in the design of instruction sets, processor pipelines, and memory system organizations.
Amdahl’s Law: the part of the computation that cannot be sped up limits the effect of the parts that can
Analytical tools to guide quantitative evaluation
Order of magnitude analysis (Big O notation)
Analysis of slow and fast paths of a system
Events on their effect on performance (e.g., instruction stalls, cache misses, page faults)
Understanding layered systems, workloads, and platforms, their implications for performance, and the challenges they represent for evaluation
Microbenchmarking pitfalls

Learning Outcomes:

Explain how the components of system architecture contribute to improving its performance.
Explain the circumstances in which a given figure of system performance metric is useful.
Explain the usage and inadequacies of benchmarks as a measure of system performance.
Describe Amdahl’s law and discuss its limitations.
Use limit studies or simple calculations to produce order-of-magnitude estimates for a given performance metric in a given context.
Use software tools to profile and measure program performance.
Design and conduct a performance-oriented experiment of a common system (e.g., an OS and Spark).
Conduct a performance experiment on a layered system to determine the effect of a system parameter on figure of system performance.

SF/System Reliability

[1 Core-Tier1 hours and 1 Core-Tier2 hours]

Topics:

Distinction between bugs and faults
Reliability through redundancy: check and retry
Reliability through redundancy: redundant encoding (error correcting codes, CRC, FEC)
Reliability through redundancy: duplication/mirroring/replicas
Other approaches to reliability

Learning Outcomes:

Explain the distinction between program errors, system errors, and hardware faults (e.g., bad memory) and exceptions (e.g., attempt to divide by zero).
Articulate the distinction between detecting, handling, and recovering from faults, and the methods for their implementation.
Describe the role of error correcting codes in providing error checking and correction techniques in memories, storage, and networks.
Apply simple algorithms for exploiting redundant information for the purposes of data correction.
Compare different error detection and correction methods for their data overhead, implementation complexity, and relative execution time for encoding, detecting, and correcting errors.

SF/System Security

[1 Core-Tier1 hours and 1 Core-Tier2 hours]

Topics:

Common system security issues (e.g., virus, denial-of-service attack and eavesdropping)
Countermeasures
- Cryptography
- Security architecture
- Intrusion detection systems, firewalls

Learning Outcomes:

Describe some common system security issues and give examples.
Describe some countermeasures against system security issues.

SF/System Design

[1 Core-Tier1 hours and 1 Core-Tier2 hours]

Topics:

Common criteria of system design (e.g., liveness, safety, robustness, scalablility and security)
Designs of representative systems (e.g., Apache web server, Spark and Linux)

Learning Outcomes:

Describe common criteria of system design.
Given the functionality requirements of a system and its key design criterias, provide a high-level design of this system.
Describe the design of some representative systems.