Use LEFT and RIGHT arrow keys to navigate between flashcards;
Use UP and DOWN arrow keys to flip the card;
H to show hint;
A reads text to speech;
147 Cards in this Set
- Front
- Back
In-Memory Databases
|
Store data in main memory rather than secondary storage. Take advantage of parallel computing and multicore processors to deliver faster business intelligence
|
|
Database
|
1. Specialized structure that allows computer-based systems to store, manage, and retrieve data very quickly
2. A shared, integrated computer structure that stores a collection of End-User data and Metadata. |
|
Data
|
Raw facts that have not yet been processed to reveal their meaning. The foundation of information.
|
|
Information
|
the result of processing raw data to reveal its meaning.
Can be used as the foundation for decision making. |
|
Data Processing
|
Can be as simple as organizing data to reveal patterns or as complex as making forecasts or drawing inferences using statistical modeling.
|
|
Context
|
reveals meaning, information
|
|
We are now said to be entering the _________age.
|
Knowledge
|
|
Timely and useful information requires ____________ data.
|
Accurate
|
|
Data Management
|
A Discipline that focuses on the proper generation, storage and retrieval of data.
Core Activity for any business, government agency, service organization, or charity. |
|
End-user Data
|
Raw facts of interest to the end user.
|
|
Metadata
|
Data about data, through which the end-user data are integrated and managed. Describe the data characteristics and the set of relationships that link the data found within the database.
Provides information that complements and expands the value and use of the data. Present a more complete picture of the data in the database. |
|
Database Management System (DBMS)
|
A collection of programs that manages the database structure and controls access to the data stored in the database.
Serves as the intemediary between the user and the database. |
|
The ____________ program might be written by a programmer using a programming language like Java or C#, or it might be created through a DBMS utility program.
|
Application
|
|
DBMS Advantages
|
* Improved Data Sharing
* Improved Data Security * Better Data Integration * Minimized Data Inconsistency *Improved Data Access * Improved Decision Making * Increased end-user productivity |
|
Centralized Database
|
Database that supports data located at a single site.
|
|
Distributed Database
|
Database that supports data distributed across several different sites
|
|
Data inconsistency
|
exists when different versions of the same data appear in different places
|
|
Query
|
A specific request issued to the DBMS for data manipulation
|
|
Ad Hoc Query
|
A Spur-of-the-moment question
|
|
Query Result Set
|
The DBMS sends back an answer/response to the application.
|
|
Data Quality
|
A comprehensive approach to promoting the accuracy, validity, and timeliness of the data.
|
|
Single-User Database
|
Supports only one user at a time.
|
|
Desktop Database
|
Single-user database that runs on a PC
|
|
Multiuser Database
|
Supports mutiple users at the same time.
|
|
Workgroup Database
|
multiuser database that supports a relatively small number of users - usually fewer than 50, or a specific department within an organization
|
|
Enterprise Database
|
Multiuser database used by the entire organization and supports many users - more than 50, usually hundreds.
|
|
General-purpose Database
|
Contains a wide variety of data used in multiple disciplines
|
|
Discipline-specific Database
|
Contains data focused on specific subject areas. Mainly used for academic or research purposes within a small set of disciplines.
|
|
Operational Database
|
designed primarily to support a company's day-to-day operations. Also known as an online transaction processing, transactional, or production database.
|
|
Analytical Databases
|
Focus primarily on storing historical data and business metrics used exclusively for tactical or strategic decision making.
Typically requires extensive "data massaging"(manipulation) |
|
Data Warehouse
|
a specialized database that stores data in a format optimized for decision support. Contains historical data obtained from the operational databases as well as data from other external sources
|
|
Online Analytical Processing (OLAP)
|
A set of tools that work together to provide an advanced data analysis environment for retrieving, processing, and modeling data from the data warehouse
|
|
Business intelligence
|
Describes a comprehensive approach to capture and process business data with the purpose of generating information to support business decision making.
|
|
Unstructured Data
|
Data that exist in their original (raw) state - in the format they were collected - does not lend itself to the processing that yields information.
|
|
Structured Data
|
the result of formatting unstructured data to facilitate storage, use, and the generation of information.
Apply structure (format) based on the type of processing that you intend to perform on the data. |
|
Semistructured Data
|
Data that has already been processed to some extent.
|
|
XML Databases
|
A new generation of databases for storage and management of semistructured data.
|
|
XML (Extensible Markup Language)
|
Special language used to represent and manipulate data elements in a textual format.
|
|
Social Media
|
Refers to Web and mobile technologies that enable "anywhere, anytime, always on" human interactions.
|
|
NoSQL Database (Not Only SQL)
|
new generation of database management systems that is not based on the traditional relational database model.
|
|
Database design
|
Refers to the activities that focus on the design of the database structure that will be used to store and manage end-user data.
|
|
__________ is one of an organization's most valuable assets
|
Data
|
|
Field
|
A character of group of characters (*alpha or numeric) that has a specific meaning. A field is used to define and store data.
|
|
Record
|
A logically connected set of one or more fields that describes a person, place, or thing.
|
|
File
|
A collection of related records
|
|
Structural dependence
|
access to a file is dependent on its structure
|
|
Structural Independence
|
exists when you can change the file structure without affecting the application's ability to access the data.
|
|
Data Dependence
|
all data access programs subject to change when any of the file's data storage characteristics change (changing the data type)
|
|
Data Independence
|
You can change the data storage characteristics without affecting the program's ability to access the data.
|
|
Logical Data Format
|
How humans view the data
|
|
Physical Data Format
|
How the computer must work with the data.
|
|
Islands of Information
|
The organizations structure promotes the storage of the same basic data in different, scattered locations. The file system's structure makes it difficult to combine data from multiple sources and its lack of security renders the file system vulnerable to security breaches.
|
|
Redundancy
|
The same data are stored unnecessarily at different places.
|
|
Uncontrolled Data Redundancy Results in:
|
Poor data security
Data inconsistency Data anomalies |
|
Data integrity
|
the condition in which all data in the database are consistent with real-world events and conditions.
* data are accurate * data are verifiable |
|
Data Anomaly
|
Develops when not all of the required changes in the redundant data are made successfully.
|
|
Database System
|
Refers to an organization of components that define and regulate the collection, storage, management, and use of data within a database environment.
Composed of Five Major Parts: Hardware Software People Procedures Data |
|
Hardware
|
Refers to all of the system's physical devices, including computers, servers and supercomputers, storage devices, printers, network devices (hubs, switches, routers, fiber optics) and other devices (ATMs, ID readers...)
|
|
Software
|
Three types of software are needed to make the database system function fully: OS Software, DBMS Software, and Application Programs and Utilities.
|
|
__________________ programs are most commonly used to access data within the database to generate reports, tabulations, and other information to facilitate decision making.
|
Application
|
|
System Administrators
|
Oversee the database system's general operations
|
|
Database administrators
|
manage the DBMS and ensure that the database is functioning properly
|
|
Database Designers
|
Design the database structure, the database architects.
|
|
System Analysts and programmers
|
Design and implement the application programs. Design and create the data-entry screens, reports, and procedures through which end users access and manipulate the database's data
|
|
End Users
|
the people who use the application programs to run the organization's daily operations
|
|
Procedures
|
the instructions and rules that govern the design and use of the database system. Critical component of the system. Enforce the standards by which business is conducted within the organization and with customers.
|
|
Data Dictionary
|
definitions of the data elements and their relationships (metadata)
Used by the DBMS |
|
Performance Tuning
|
the activities that make the database perform more efficiently in terms of storage and access speed
|
|
Query Language
|
Non-procedural language, lets the user specify what must be done without having to specify how.
|
|
Structured Query Language (SQL)
|
the de facto query language and data access standard supported by the majority of DBMS vendors
|
|
Data Model
|
a relatively simple representation , usually graphical, of more complex real-world data structures. a Model is an abstraction of a more complex real-world even and the function is to help you understand the complexities of the real-world environment.
A Communication Tool! |
|
T/F
Data modeling is an iterative, progressive process. |
True
start with a simple understanding of the problem domain, as understanding increases, so does the level of detail of the data model. Should contain at least: * description of the data structure that will store the end-user data. * set of enforceable rules to guarantee the integrity of the data. * data manipulation methodology to support the real-world data transformations. |
|
Attribute
|
Characteristic of an entity.
the equivalent of fields in a file system |
|
Relationship
|
Association among entities.
Data models use three types: one-to-many 1:M or 1..* many-to-many M:N or *..* one-to-one 1:1 or 1..1 |
|
Constraint
|
Restriction placed on the data.
Important because they help to ensure data integrity and are normally expressed in the form of rules. i.e., A student's GPA must be between 0.00 and 4.00 |
|
Business Rule
|
A brief, precise, and unambiguous description of a policy, procedure or principle within a specific organization. Derived from a detailed description of an organization's operations help to create and enforce actions within that organization's environment.
Used to define entities, attributes, relationships and constraints. |
|
Reasons the process of identifying and documenting business rules is essential to database design:
|
* help to standardize the company's view of data
* can be a communications tool between users and designers. * allow the designer to understand the nature, role and scope of the data * allow the designer to understand business processes. * allow the designer to develop appropriate relationship participation rules and constraints and to create an accurate data model. |
|
Noun Translation in a Business Rule
|
an entity in the model
|
|
Verb Translation in a Business Rule
|
(active or passive) that associates the nouns will translate into a relationship among the entities.
|
|
Two questions to ask when properly identifying the relationship type:
|
1. How many instances of B are related to one instance of A?
2. How many instances of A are related to one instance of B? |
|
Hierarchical Data Model
|
developed in the 1960s to manage large amounts of data for complex manufacturing projects. Basic logical structure is represented by an upside-down tree and contains levels, or segments
(Each parent can have many children, but each child has only one parent) |
|
Segment (Hierarchical Data Model)
|
The equivalent of a file system's record type.
|
|
Network Model
|
Created to represent complex data relationships more effectively than the hierarchical model to improve performance and to impose a database standard.
User perceives the network database as a collection of records in a 1:M relationship. (Allows a record to have more than one parent.) |
|
Schema
|
Conceptual organization of the entire database as viewed by the database administrator.
|
|
Subschema
|
Defines the portion of the database "seen" by the application programs that actually produce the desired information from the data within the database.
|
|
Data Manipulation Language (DML)
|
Defines the environment in which data can be managed and is used to work with the data in the database.
|
|
Data Definition Language (DDL)
|
Enables the database administrator to define the schema components.
|
|
Relational Model
|
Produced an "automatic transmission" database to replace the "standard transmission" databases that preceded it.
Foundation is a mathematical concept known as a relation(sometimes called a table) as a matrix composed of intersecting rows and columns Each row is called a tuple. Each column represents an attribute. |
|
RDBMS
|
Relational Database Management System - performs the same basic functions provided by the hierarchical and network DBMS systems, in addition to a host of other functions that make the relational data model easier to understand and implement.
|
|
Relational Diagram
|
representation of the relational database's entities, the attributes within those entities, and the relationships between those entities.
|
|
Relational Table
|
Stores a collection of related entities.
|
|
Three Parts of a SQL_based Relational Database Application
|
1. End-User Interface: allows the end user to interact with the data (by automatically generating SQL code)
2. Set of Tables stored in the database - each table is independent, rows in different tables are related by common values in common attributes. 3. SQL Engine: executes all queries, or data requests. Is part of the DBMS software. |
|
Entity Relationship Model (ERM)
|
graphical tool in which entities and their relationships are pictured., has become a widely accepted standard for data modeling.
Combined with the Relational Data Model, provide the foundation for tightly structured database design. |
|
Entity
|
Anything about which data will be collected and stored. An Entity is a noun, represented int the ERD by a rectangle, with the name written in singular form, all caps in the center.
|
|
Entity Instance/Entity Occurrence
|
Each row in the relational table.
|
|
Entity Set
|
A collection of like entities
|
|
Object-Oriented Data Model
|
Both data and their relationships are contained in a single structure. Described by its factual content.
|
|
OODM Components
|
*An object is an abstraction of a real-world entity. (Equivalent to an ER Model's entity. Represents only one occurrence of an entity
*Attributes describe the properties of an object. i.e., PERSON object includes the attributes: SS# and DOB *Objects that share similar characteristics are grouped in classes, contains a set of procedures (methods) which define an object's behavior *Classes are organized in a class hierarchy which resembles an upside-down tree in which each class has only one parent *Exhibits Inheritance - the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it. |
|
Class
|
Collection of similar objects with shared structure (attributes) and behavior (methods)
|
|
Method
|
represents a real-world action such as finding a person's name, changing a person's name or printing a person's address. The equivalent of a procedure in traditional programming languages.
OO methods define an object's behavior |
|
Unified Modeling Language (UML)
|
language based on OO concepts that describes a set of diagrams and symbols you can use to graphically model a system's modeling language.
|
|
Extended relational Data Model (ERDM)
|
adds many of the OO model's features within the inherently simpler relational database structure.
|
|
Extensible Markup Language (SML)
|
de facto standard for the efficient and effective exchange of structured, semistructured, and unstructured data.
|
|
Big Data
|
refers to a movement to find new and better ways to manage large amounts of Web-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
|
|
NoSQL
|
refers to a new generation of databases that address the specific challenges of the Big Data era and have the following general characteristics:
* not based on the relational model * supports distributed database architectures. * provides high scalability, high availability, and fault tolerance * Supports very large amounts of sparse data. * geared toward performance rather then transaction consistency. |
|
Key-Value
|
Data model based on a structure composed of two data elements: a key and a value, in which every key has a corresponding value or set of values.
|
|
In the relational model, every row represents:
|
a single entity occurrence
|
|
In the relational model, every column represents:
|
an attribute of the entity occurrence and has a defined data type
|
|
In the key-value data model, every row represents:
|
one attribute of one entity instance.
|
|
In the key-value data model, every column represents:
|
the "key" column points to an attribute and the "value" column contains the actual value for the attribute.
|
|
Sparse Data
|
the number of attributes is vary large but the number of actual data instances is low.
|
|
Eventual Consistency
|
updates to the database will propagate through the system and eventually all data copies will be consistent - not guaranteed to be consistent for a period of time after an update.
|
|
External Schema
|
a specific representation of an external view
|
|
Conceptual Model (Conceptual Schema)
|
represents a global view of the entire database by the entire organization; integrates all external views (entities, relationships, constraints, and processes) into a single global view of the data in the enterprise. The basis for the identification and high-level description of the main data objects (avoiding any database model-specific details).
|
|
Internal Model
|
the representation of the database as "seen" bu the DBMS. Requires the designer to match the conceptual model's characteristics and constraints to those of the selected implementation model.
|
|
Internal Schema
|
depicts a specific representation of an internal model, using the database constructs supported by the chosen database.
|
|
Physical Model
|
operates at the lowest level of abstraction, describing the way data are saved on storage media such as disks or tapes. Requires the definition of both the physical storage devices and the (physical) access methods required to reach the data within those storage devices - both software and hardware dependent.
|
|
Table
|
Perceived as a two-dimensional structure composed of rows and columns, also called a relation whose contents can be permanently saved for future use - contains a group of related entity occurrences (entity set)
|
|
Characteristics of a Relational Table
|
1. Perceived as a 2D structure composed of rows and columns.
2. Each table row (tuple) represents a single entity occurrence within the entity set. 3. Each table column represents an attribute, each column has a distinct name 4. Each intersection of a row and column represents a single data value. 5. All values in a column must conform to the same data format 6. Each column has a specific range of values known as the attribute domain. 7. The order of the rows and columns is immaterial to the DBMS. 8. Each table must have an attribute or combination of attributes that uniquely identifies each row. |
|
Domain
|
a column's range of permissible values
|
|
Primary Key (PK)
|
an attribute or combination of attributes that uniquely identifies any given row.
Two Requirements: 1. all of the values in the primary key must be unique. 2. no key attribute in the primary key can contain a null. |
|
Determination
|
The state in which knowing the value of one attribute makes it possible to determine the value of another
|
|
Functional Dependence
|
the value of one or more attributes determines the value of one or more other attributes.
|
|
Determinant (Key)
|
the attribute whose value determines another
|
|
Dependent
|
The attribute whose value is determined by the other attribute
|
|
Full Functional Dependence
|
term used to refer to the functional dependencies in which the entire collection of attributes in the determinant is necessary for the relationship
|
|
Composite Key
|
Key that is composed of more than one attribute.
|
|
Candidate Key
|
Minimal superkey (irreducible) - a superkey that does not contain a subset of attributes that is itself a superkey. The eligible options from which the designer will choose when selecting the primary key.
|
|
Entity Integrity
|
The condition in which each row (entity instance) in the table has its own unique identity.
|
|
Null
|
The absence of any data
|
|
Foreign Key
|
The primary key of one table that has been placed into another table to create a common attribute
|
|
Referential Integrity
|
the condition in which every reference to an entity instance by another entity instance is valid. Every foreign key entry must either be null or a valid value in the primary key of the related table.
|
|
Secondary key
|
a key that is used strictly for data retrieval purposess
|
|
Superkey
|
An attribute or combination of attributes that uniquely identifies each row in a table
|
|
Relational Algebra
|
defines the theoretical way of manipulating table contents using the eight relational operators: SELECT, PROJECT, JOIN, INTERSECT, UNION, DIFFERENCE, PRODUCT, and DIVIDE.
|
|
Closure
|
the use of relational algebra operators on existing relations (tables) produces new relations
|
|
SELECT
|
Also known as RESTRICT, yields values for all rows found in a table that satisfy a given condition - can be used to list all of the row values, or it can yield only row values that match a specified criterion.
|
|
PROJECT
|
Yields all values for selected attributes - yields a vertical subset of a table
|
|
UNION
|
Combines all rows from two tables, excluding duplicate rows.
|
|
Union-Compatible
|
When two or more tables share the same number of columns and when their corresponding columns share the same or compatible domains
|
|
INTERSECT
|
Yields only the rows that appear in both tables. The tables must be union-compatible to yield results.
|
|
PRODUCT
|
Yields all possible pairs of rows from two tables - also known as the Cartesian product.
|
|
JOIN
|
Allows information to be combined from two or more tables.
** The real power behind the relational database, allowing the use of independent tables linked by common attributes. |
|
Natural Join
|
links tables by selecting only the rows with common values in their common attributes.
3 Stage Process: 1. PRODUCT of the table is created 2. SELECT is performed on the output of 1st step to yield only the rows where the values are equal 3. PROJECT is performed on the results of step 2 to yield a single copy of each attribute, thereby eliminating duplicate columns. |
|
Equijoin
|
Links tables on the basis of an equality condition that compares specified columns of each table.
|
|
Inner Join
|
only returns matched records from the tables that are being joined.
|
|
Outer Join
|
the matched pairs would be retained and any unmatched values in the other table would be left null
|