Big Data Modeling and Management Systems
Description
When you enroll in courses through Coursera, you can choose between a paid plan and a free plan.
- Free plan: No certification and/or audit only. You will have access to all course materials except graded items.
- Paid plan: Commit to earning a Certificate—it's a trusted, shareable way to showcase your new skills.
About this course: Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools. Through guided hands-on tutorials, you will become familiar with techniques using real-time and semi-structured data examples. Systems and tools discussed include: AsterixDB, HP Vertica, Impala, Neo4j, Redis, SparkSQL. This course provides techniques to extract value from existing untapped data sources and to discover new data sources.
At the end of this course, you will be able to:
- Recognize different data elements in your own work and in everyday life problems
- Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
- Identify the frequent data operations required for various types of data
- Select a data model to suit the characteristics of your data
- Apply techniques to handle streaming data
- Differentiate between a traditional Database Management System and a Big Data Management System
- Appreciate why there are so many data management systems
- Design a big data information system for an online game company
This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.
Hardware Requirements:
- (A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit
- (B) 8 GB RAM
- (C) 20 GB disk free
How to find your hardware information:
- Windows: Open System by clicking the Start button, right-clicking Computer, and then clicking Properties
- Mac: Open Overview by clicking on the Apple menu and clicking “About This Mac.”
Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high speed internet connection because you will be downloading files up to 4 GB in size.
Software Requirements: This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.
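As an alternative to the system dialogs, part of the hardware list can be checked with a short script. The following is a minimal sketch, not part of the course materials: the function name `check_hardware` and the constants are hypothetical, and it uses only the Python standard library. It checks processor count and free disk space against the minimums stated in the course description. Note that `os.cpu_count()` reports logical processors, which may exceed the number of physical cores, and that RAM is not checked because the standard library has no portable call for it.

```python
import os
import shutil

# Minimums taken from the hardware requirements in the course description.
MIN_CORES = 4
MIN_FREE_DISK_GB = 20


def check_hardware(path="."):
    """Return a dict mapping each check to (measured value, meets minimum)."""
    cores = os.cpu_count() or 0  # cpu_count() may return None
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "cpu_cores": (cores, cores >= MIN_CORES),
        "free_disk_gb": (round(free_gb, 1), free_gb >= MIN_FREE_DISK_GB),
    }


if __name__ == "__main__":
    for name, (value, ok) in check_hardware().items():
        print(f"{name}: {value} ({'ok' if ok else 'below minimum'})")
```

Pass the drive or directory where the virtual machine will be stored as `path` so that the free-space figure refers to the right disk.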
Created by: University of California, San Diego
Taught by: Ilkay Altintas, Chief Data Science Officer, San Diego Supercomputer Center
Taught by: Amarnath Gupta, Director, Advanced Query Processing Lab, San Diego Supercomputer Center (SDSC)
Each course is like an interactive textbook, featuring pre-recorded videos, quizzes and projects.
Help from your peers: Connect with thousands of other learners and debate ideas, discuss course material, and get help mastering concepts.
Certificates: Earn official recognition for your work, and share your success with friends, colleagues, and employers.
University of California, San Diego: UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom; life is their laboratory.
Syllabus
WEEK 1
Introduction to Big Data Modeling and Management
Welcome to this course on big data modeling and management. Modeling and managing data is a central focus of all big data projects. In these lessons we introduce you to the concepts behind big data modeling and management and set the stage for the remainder of the course.
14 videos, 8 readings
- Video: Welcome to Big Data Modeling and Management
- Video: Why is this a New Course in the Big Data Specialization?
- Discussion Prompt: Getting to know you: Tell us about yourself and why you are taking this course
- Video: Summary of Introduction to Big Data (Part 1)
- Video: Summary of Introduction to Big Data (Part 2)
- Video: Summary of Introduction to Big Data (Part 3)
- Reading: Slides: Summary of Introduction to Big Data
- Video: Big Data Management "Must-Ask Questions"
- Video: Data Ingestion
- Video: Data Storage
- Video: Data Quality
- Video: Data Operations
- Video: Data Scalability and Security
- Reading: Slides: Big Data Management
- Discussion Prompt: Let's discuss: What area of big data management interests you most?
- Reading: Reading on Storage Systems
- Video: Energy Data Management Challenges at ConEd
- Reading: Slides: Energy Data Management Challenges at ConEd
- Video: Gaming Industry Data Management: Q&A with Apmetrix CTO Mark Caldwell
- Video: Flight Data Management at FlightStats: A Lecture by CTO Chad Berkley
- Reading: Slides: Flight Data Management at FlightStats
- Discussion Prompt: Let's discuss: What are the design criteria in the big data applications you have heard?
- Reading: Downloading and Installing the Cloudera VM Instructions (Windows)
- Reading: Downloading and Installing the Cloudera VM Instructions (Mac)
- Reading: Instructions for Downloading Hands On Datasets
WEEK 2
Big Data Modeling
Modeling big data depends on many factors including data structure, which operations may be performed on the data, and what constraints are placed on the models. In these lessons you will learn the details about big data modeling and you will gain the practical skills you will need for modeling your own big data projects.
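The three factors named above (structure, operations, constraints) can be illustrated with a small relational-style example. This is a minimal sketch under stated assumptions: the sample CSV data and the helper names `load_rows`, `check_constraints`, and `select` are hypothetical and do not come from the course's hands-on materials.

```python
import csv
import io

# Hypothetical sample data, invented for illustration.
CSV_TEXT = "player_id,score\n1,40\n2,75\n3,55\n"


def load_rows(text):
    """Structure: every record is a row with the same named columns."""
    return list(csv.DictReader(io.StringIO(text)))


def check_constraints(rows):
    """Constraints: player_id must be unique and score must be non-negative."""
    ids = [row["player_id"] for row in rows]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate player_id")
    for row in rows:
        if int(row["score"]) < 0:
            raise ValueError(f"negative score for player {row['player_id']}")


def select(rows, min_score):
    """Operation: selection, keeping only rows that satisfy a condition."""
    return [row for row in rows if int(row["score"]) >= min_score]


rows = load_rows(CSV_TEXT)
check_constraints(rows)
high_scorers = select(rows, min_score=50)
print([row["player_id"] for row in high_scorers])
```

A semistructured model such as JSON would relax the structure part: records could then carry different or nested fields, and the operations and constraints would have to account for that.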
11 videos, 8 readings
- Video: Introduction to Data Models
- Video: Data Model Structures
- Video: Data Model Operations
- Video: Data Model Constraints
- Reading: Slides: What Is A Data Model?
- Discussion Prompt: Let's discuss: Modeling data in your daily life
- Reading: Introduction to CSV Data
- Video: Introduction to CSV Data
- Video: What is a Relational Data Model?
- Reading: Slides: What Is A Relational Data Model?
- Video: What is a Semistructured Data Model?
- Reading: Slides: What is a Semistructured Data Model?
- Discussion Prompt: Let's discuss: Utilization of XML or JSON on the Internet
- Reading: Exploring the Relational Data Model of Comma Separated Values (CSV)
- Video: Exploring the Relational Data Model of CSV Files
- Reading: Exploring the Semistructured Data Model of JSON data
- Video: Exploring the Semistructured Data Model of JSON data
- Reading: Exploring the Array Data Model of an Image
- Video: Exploring the Array Data Model of an Image
- Reading: Exploring Sensor Data
- Video: Exploring Sensor Data
Graded: Practical Quiz for Week 2 Hands-On Lectures
WEEK 3
Big Data Modeling (Part 2)
These lessons continue to shed light on big data modeling with specific approaches including vector space models, graph data models, and more.
5 videos, 5 readings
- Video: Vector Space Model
- Reading: Slides: Vector Space Model
- Video: Graph Data Model
- Reading: Slides: Graph Data Model
- Video: Other Data Models
- Reading: Slides: Other Data Models
- Reading: Exploring Vector Data Models with Lucene
- Video: Exploring the Lucene Search Engine's Vector Data Model
- Reading: Exploring Graph Data Models with Gephi
- Video: Exploring Graph Data Models with Gephi
Graded: Data Models Quiz
WEEK 4
Working With Data Models
Data models deal with many different types of data formats. Streaming data is becoming ubiquitous, and working with streaming data requires a different approach from working with static data. In these lessons you will gain practical hands-on experience working with different forms of streaming data, including weather data and Twitter feeds.
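One way to see why streaming data needs a different approach is that a stream has no fixed end, so results have to be produced as each item arrives while keeping only a bounded amount of state. The sketch below is an illustration only, not the course's method: `fake_sensor` is a hypothetical stand-in for a live feed, and `sliding_average` is an assumed helper name.

```python
from collections import deque


def sliding_average(stream, window=3):
    """Yield the mean of the most recent `window` readings as each one arrives."""
    recent = deque(maxlen=window)  # older readings are dropped automatically
    for reading in stream:
        recent.append(reading)
        yield sum(recent) / len(recent)


def fake_sensor():
    """Hypothetical stand-in for a live feed such as a weather station."""
    yield from [20.0, 22.0, 24.0, 26.0]


averages = list(sliding_average(fake_sensor(), window=3))
print(averages)  # [20.0, 21.0, 22.0, 24.0]
```

With static data the whole dataset could be loaded and averaged at once; here memory use stays constant no matter how long the feed runs.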
6 videos, 7 readings
- Video: Data Model vs. Data Format
- Reading: Slides: Data Model vs. Data Format
- Video: What is a Data Stream?
- Reading: Slides: What is a Data Stream?
- Video: Why is Streaming Data different?
- Reading: Slides: Why is Streaming Data Different?
- Video: Understanding Data Lakes
- Reading: Slides: Understanding Data Lakes
- Discussion Prompt: Let's discuss: Streaming data applications
- Reading: Exploring Streaming Sensor Data
- Video: Exploring Streaming Sensor Data
- Reading: Instructions for Creating a Twitter App (Optional)
- Reading: Exploring Streaming Twitter Data (Optional)
- Video: Exploring Streaming Twitter Data (Optional)
Graded: Data Formats and Streaming Data Quiz
WEEK 5
Big Data Management: The "M" in DBMS
Managing big data requires a different approach to database management systems because the wide variation in data structure does not lend itself to traditional DBMSs. There are many applications available to help with big data management. In these lessons we introduce you to some of these applications and provide insight into how and when they might be appropriate for your own big data management challenges.
7 videos, 2 readings
- Video: DBMS-based and non-DBMS-based Approaches to Big Data
- Reading: Slides: DBMS-based and non-DBMS-based Approaches to Big Data
- Video: From DBMS to BDMS
- Video: Redis: An Enhanced Key-Value Store
- Video: Aerospike: a New Generation KV Store
- Video: Semistructured Data – AsterixDB
- Video: Solr: Managing Text
- Video: Relational Data – Vertica
- Reading: Slides: From DBMS to BDMS
Graded: BDMS Quiz
WEEK 6
Designing a Big Data Management System for an Online Game
In these lessons we give you the opportunity to learn about big data modeling and management using a fictitious online game called "Catch the Pink Flamingo".
1 reading
- Reading: A Game by Eglence Inc.: Catch The Pink Flamingo
- Discussion Prompt: Let's discuss: Analytical tasks to make Catch the Pink Flamingo better
- Discussion Prompt: Let's discuss: Using the data model for Catch the Pink Flamingo
Graded: Designing a Data Model for 'Catch the Pink Flamingo'