Data

Traditionally, Data Management has focused on data persisted within organizations, usually in relational databases. Such data assets support the core business processes of the organization and form the basis for business applications. Increasingly, organizations also process ever larger volumes of data that emerge from expansive digitalization (web traffic, social media, and sensed sources). Regardless of the source and type of data, the fundamental questions and concerns of this realm remain the same: how to gather, organize, curate, and process data to help run an organization or to extract actionable information that increases its effectiveness. The Data/Information competency realm comprises one required area (Data and Information Management) and two elective areas (Data and Business Analytics; Data and Information Visualization).

Competency Areas

Data Security

The Data Security knowledge area focuses on the protection of data at rest, during processing, and in transit. Fully implementing these protections requires the application of mathematical and analytical algorithms.

Knowledge Units

  • Cryptography
  • Digital Forensics
  • Data Integrity and Authentication
  • Access Control
  • Secure Communication Protocols
  • Cryptanalysis
  • Data Privacy
  • Information Storage Security
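As a minimal sketch of what the Data Integrity and Authentication unit is concerned with, the following uses a keyed hash (HMAC-SHA256) from Python's standard library to detect tampering with a message. The key, the message contents, and the helper names `sign` and `verify` are assumptions made for illustration only; they are not part of the knowledge area itself.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag that binds the message to the secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical demo values; a real key would come from a key-management system.
key = b"demo-key-not-for-production"
tag = sign(key, b"balance=100")

print(verify(key, b"balance=100", tag))  # unmodified message
print(verify(key, b"balance=900", tag))  # altered message
```

The first check succeeds because the message is unchanged; the second fails because the altered message no longer matches the tag. This protects integrity and authenticity only; it does not keep the message confidential.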

Design

This knowledge unit describes techniques for including security considerations throughout the design of software.

Topics

  1. Derivation of security requirements: Beginning with business, mission, or other objectives, determine what security requirements are necessary to succeed. These may also be derived, or changed, as the software evolves.
  2. Specification of security requirements: Translate the security requirements into a form that can be used (formal specification, informal specifications, specifications for testing).
  3. Software development lifecycle/Security development lifecycle: Include examples such as the waterfall model and agile development, and how security fits into each.
  4. Programming languages and type-safe languages: Discuss the problems that programming languages introduce, what type-safety does, and why it is important.

Fundamental Principles

This knowledge unit introduces the principles that underlie both design and implementation. The first five are restrictiveness principles, the next three are simplicity principles, and the rest are methodology principles.

Topics

  1. Least privilege: Software should be given only those privileges that it needs to complete its task.
  2. Fail-safe defaults: The initial state should be to deny access unless access is explicitly required. Then, unless software is given explicit access to an object, it should be denied access to that object and the protection state of the system should remain unchanged.
  3. Complete mediation: Software should validate every access to objects to ensure that the access is allowed.
  4. Separation: Software should not grant access to a resource, or take a security-relevant action, based on a single condition.
  5. Minimize trust: Software should check all inputs and the results of all security-relevant actions.
  6. Economy of mechanism: Security features of software should be as simple as possible.
  7. Minimize common mechanism: The sharing of resources should be reduced as much as possible.
  8. Least astonishment: Security features of software, and security mechanisms it implements, should be designed so that their operation is as logical and simple as possible.
  9. Open design: Security of software, and of what that software provides, should not depend on the secrecy of its design or implementation.
  10. Layering: Organize software in layers so that modules at a given layer interact only with modules in the layers immediately above and below it. This allows you to test the software one layer at a time, using either top-down or bottom-up techniques, and reduces the access points, enforcing the principle of separation.
  11. Abstraction: Hide the internals of each layer, making only the interfaces available; this enables you to change how a layer carries out its tasks without affecting components at other layers.
  12. Modularity: Design and implement the software as a collection of co-operating components (modules); indeed, each module interface is an abstraction.
  13. Complete linkage: Tie software security design and implementation to the security specifications for that software.
  14. Design for iteration: Plan the design in such a way that it can be changed, if needed. This minimizes the security impact of changing the design when the specifications do not match an environment in which the software is used.
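Several of the restrictiveness principles above can be illustrated with a small sketch. The example below assumes a hypothetical in-memory policy (`AccessPolicy`) and data store (`GuardedStore`); these names, the subjects, and the objects are invented for illustration. It shows fail-safe defaults (anything not explicitly granted is denied), complete mediation (every read and write goes through the same check), and least privilege (each subject is granted only the action it needs).

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    # Hypothetical policy table: (subject, object) -> set of permitted actions.
    grants: dict = field(default_factory=dict)

    def grant(self, subject: str, obj: str, action: str) -> None:
        self.grants.setdefault((subject, obj), set()).add(action)

    def is_allowed(self, subject: str, obj: str, action: str) -> bool:
        # Fail-safe default: a missing entry means the access is denied.
        return action in self.grants.get((subject, obj), set())


class GuardedStore:
    def __init__(self, policy: AccessPolicy) -> None:
        self._policy = policy
        self._data = {}

    def _check(self, subject: str, obj: str, action: str) -> None:
        # Complete mediation: every public access path calls this check.
        if not self._policy.is_allowed(subject, obj, action):
            raise PermissionError(f"{subject} may not {action} {obj}")

    def read(self, subject: str, obj: str):
        self._check(subject, obj, "read")
        return self._data.get(obj)

    def write(self, subject: str, obj: str, value) -> None:
        self._check(subject, obj, "write")
        self._data[obj] = value


# Least privilege: the loader may only write, the report job may only read.
policy = AccessPolicy()
policy.grant("loader", "sales.csv", "write")
policy.grant("report_job", "sales.csv", "read")

store = GuardedStore(policy)
store.write("loader", "sales.csv", "q1=10")
print(store.read("report_job", "sales.csv"))

try:
    store.write("report_job", "sales.csv", "q1=999")
except PermissionError as err:
    print("denied:", err)
```

Because a denied request raises before any state is touched, the protection state and the stored data remain unchanged, which matches the second half of the fail-safe defaults principle.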

Software Security

The Software Security knowledge area focuses on the development and use of software that reliably preserves the security properties of the information and systems it protects. The security of a system, and of the data it stores and manages, depends in large part on the security of its software. The security of software, in turn, depends on how well the requirements match the needs that the software is to address and on how well the software is designed, implemented, tested, deployed, and maintained. Documentation is critical for everyone to understand these considerations, and ethical considerations arise throughout the creation, deployment, use, and retirement of software. The Software Security knowledge area addresses these security issues. The knowledge units within this knowledge area comprise fundamental principles and practices.

Knowledge Units

AI-General

Given the utility of AI approaches for knowledge representation and inference, a data scientist should be aware of their range and history. A data scientist should develop a good sense of existing work in order to know where to look for possible solutions to the full range of possible problems one might encounter.

Knowledge

Tier 1:

  • History of AI
  • Reality of AI (what it is, what it does) versus perception
  • Major subfields of AI: knowledge representation, logical and probabilistic reasoning, planning, perception, natural language processing, learning, robotics (both physical and virtual)

Skills

Tier 1:

  • Explain how the origins of AI have led to the current status of AI 
  • Describe major branches of AI in order to recognize useful concepts and methods when needed in Data Science

Tier 2:

  • State what AI systems are, and that they both collect and use data to implement AI and generate data that data scientists can use.
  • Describe qualitatively how robots (physical or virtual), agents, and multi-agent systems collect and use data to embed, deliver, or implement artificial intelligence.
  • Describe data collected and produced by AI systems that can be useful for data science applications.

Dispositions

Tier 1:

  • Astute to, and respectful of, the fact that AI is not a new field, but rather one with a long and rich history. 

Artificial Intelligence

Artificial Intelligence (AI) includes the methodologies for modelling and simulating several human abilities that are widely accepted as representing intelligence. Perceiving, representing, learning, planning, and reasoning with knowledge and evidence are key themes.

Concepts and methods developed for building AI systems are useful in Data Science. For example, knowledge graphs such as semantic ontologies are both used and generated by data scientists. Computer vision algorithms can be used in analysis of image data; speech and natural language processing algorithms can be applied in analysis of speech or text data. Machine learning algorithms are applied extensively to extract patterns from data. Thus, a student who is well versed in AI will be able to apply those techniques in a Data Science context.

Conversely, Data Science methods are applied extensively in AI systems. Data Science students should have an understanding of AI systems and the way they work, if they plan to apply their work to AI. 

Due to their centrality in Data Science, AI competencies related to images, text, and machine learning are highlighted elsewhere. Working with images and text is in the Data Acquisition, Management and Governance KA; Machine Learning is its own KA but is also referenced extensively in the Data Mining KA. This knowledge area addresses knowledge representation, reasoning, and planning.

Scope

  • Major subfields of AI
  • Representation and reasoning
  • Planning and problem solving
  • Ethical considerations

Competencies

  • Describe major areas of AI as well as contexts in which AI methods may be applied.
  • Represent information in a logic formalism and apply relevant reasoning methods.
  • Represent information in a probabilistic formalism and apply relevant reasoning methods.
  • Be aware of the wide range of ethical considerations around AI systems, as well as mechanisms to mitigate problems.
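To make the probabilistic-formalism competency concrete, here is a minimal sketch of reasoning with Bayes' rule. The function name `posterior` and the numbers (a 1% prior, a 90% true-positive rate, a 5% false-positive rate) are assumptions chosen purely for illustration.

```python
def posterior(prior: float, true_positive_rate: float, false_positive_rate: float) -> float:
    """Return P(hypothesis | positive evidence) using Bayes' rule."""
    # Total probability of observing positive evidence.
    evidence = true_positive_rate * prior + false_positive_rate * (1.0 - prior)
    return true_positive_rate * prior / evidence


# Hypothetical figures: a rare condition and an imperfect detector.
p = posterior(prior=0.01, true_positive_rate=0.90, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")
```

With these assumed figures the posterior is roughly 0.154, far below the detector's 90% true-positive rate, because the low prior dominates. Representing the problem explicitly in this way is what allows the relevant reasoning method to be applied.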

Subdomains

  • AI-General – Tier 1, Tier 2
  • AI-Knowledge Representation and Reasoning (Logic-based models) – Tier 2, Elective
  • AI-Knowledge Representation and Reasoning (Probability-based models) – Tier 1, Tier 2, Elective
  • AI-Planning and Search Strategies – Tier 2, Elective

AP-Foundational Considerations

Presenting data in a suitable form is a challenging but important endeavor. For the data scientist, it means displaying data in a form that is attractive to users and audiences and readily and appropriately understandable, and that can also be of great value in revealing insights and characteristics of the data, including its underlying structure. Fundamentally, presentation influences usability.

Knowledge

Tier 1:

  • Contexts for addressing the human-computer interface: visualization of data, web pages, multimedia material, instructional material, and the general computing environment, paying attention to navigational considerations
  • Applicable theories, models, principles, guidelines, and standards for interface design and implementation
  • Different measures of effectiveness and attractiveness of an interface
  • The use of color and multimedia as well as ergonomics and web services
  • Cognitive models that influence interaction
  • The scope, advantages, and disadvantages of augmented reality
  • Software support to assist with perception regarding analysis and presentation
  • Accessibility considerations for different groupings of users including those with special needs

Skills

Tier 1:

  • Justify the adoption of a user-centered approach to analysis and presentation
  • Critique how considerations of attention, perception, recognition, speech, and movement affect the usability of an interface across a variety of contexts
  • Indicate how formal documents (theories, models, guidelines, etc.) affect the analysis and presentation of data
  • Explain how the needs of differently-abled users and different age groups (including children) should shape interfaces
  • Outline ways in which bias may be perceived in interfaces
  • Outline the range of software that can be employed in support of analysis and presentation
  • Demonstrate the added value and challenges of an augmented reality interface.

Dispositions

Tier 1:

  • Passionate and responsible recognition of the vital role of an interface in affecting all aspects of usability
