INFO 246-13
Information Technology Tools and Applications – Advanced
Topic: Big Data Analytics and Management
Fall 2018 Syllabus
Dr. Michelle Chen
E-mail
Office Hours: Virtually, by appointment via e-mail. Zoom optional drop-in office hours will also be held as needed. More details TBA on the Canvas course website.
Canvas Information: Courses will be available beginning August 21st at 6 a.m. Pacific Time unless you are taking an intensive or a one-unit or two-unit class that starts on a different day. In that case, the class will open on the first day that the class meets.
You will be enrolled into the Canvas site automatically.
Be sure to log on to the course site no later than Friday, August 24th, to begin the first lesson.
The syllabus may be revised slightly before or during the semester (with fair notice).
Course Description
With the prevalence of advanced web and storage technologies, more and more data are produced in massive amounts at a rapid rate. The data that come from instruments, sensors, Internet transactions, emails, click streams, and other digital sources require new ways of analyzing, interpreting, learning from, and managing data efficiently. As a result, "big data" analytics and management has gained increasing attention and become one of the most significant emerging fields in many disciplines, ranging from business intelligence to scientific discovery.
In this course, students will explore important big data technologies, trends, infrastructure, and management issues that enable users to make informed and strategic decisions in the presence of large-scale datasets. Specifically, the course will consist of three main parts:
- Big data infrastructure: Students will focus on learning big data technologies and trends, including large-scale databases, the map-reduce paradigm, and big data mining.
- Big data hands-on practice: Students will gain hands-on learning experience with the software "Splunk". Students will be able to conduct big data analysis and visualization in Splunk with real-world data sets such as Twitter data (i.e., "tweets"). (Please note: Some experience with scripting languages may be helpful but is not required.)
- Big data real-world uses: Students will discuss how big data analytics and management skills can be applied to different real-world fields, such as libraries and health science, and various issues including opportunities and challenges.
Course Requirements
Assignments
- Participation and Discussions (10%, supports CLO#1, CLO#2)
Students are required to actively participate in class and make thoughtful contributions to three class discussions posted on the course site. Students will be evaluated for the involvement in, and intellectual contribution to, the collaborative learning environment.
- Hands-on Practices (50%, supports CLO#1, CLO#2, CLO#3, CLO#5)
Three individual, hands-on practices will be given throughout the semester to help students review and reinforce what they have learned in class. Students will learn how to analyze and visualize big data with practical tools.
- Case Study (15%, supports CLO#4)
Students will discuss social, organizational, and managerial issues of big data applications in practice through various case studies and submit a case essay that answers and critiques the questions from the cases.
- Semester Project (25%, supports CLO#1, CLO#2, CLO#3, CLO#4, CLO#5)
Students will work in groups or solo on a semester project that consists of three phases (more details TBA on the Canvas site):
- Milestone I - Initial Thoughts: Students will submit a short paragraph discussing the potential topics and directions of the semester project. Students will also briefly present the motivation of the study and the approach that might be taken.
- Milestone II - Mini Report: Students will submit a one-page report outlining the current progress of the semester project. The report will include what has been done, what the current status and results are, and what needs to be accomplished.
- Final Report and Demo: Students will submit a detailed, 10-page report for the project. The report should at least include the following sections: motivation, problem statement, methodology, analysis results, and discussions and conclusion. Students will also prepare a short "demo" to present their work.
Course Calendar (subject to change with fair notice)
Weeks | Topics and Due Dates |
Week 1 (Aug 21-26) | Introduction and Course Overview; Introduction Due Aug 26 |
Week 2 (Aug 27 - Sep 2) | Exploring Big Data: Visualization |
Week 3 (Sep 3-9) | Large-Scale Database Management |
Week 4 (Sep 10-16) | Map-Reduce and Distributed Computing; Hands-on #1 Due Sep 16 |
Week 5 (Sep 17-23) | Big Data Mining |
Week 6 (Sep 24-30) | Introduction to Splunk; Project Milestone I Due Sep 30 |
Week 7 (Oct 1-7) | Splunk Demos and Applications; Discussion #2 Due Oct 7 |
Week 8 (Oct 8-14) | Splunk I: The Basics |
Week 9 (Oct 15-21) | Splunk II: Advanced Topics; Hands-on #2 Due Oct 21 |
Week 10 (Oct 22-28) | Splunk Workweek |
Week 11 (Oct 29 - Nov 4) | Splunk III: Real-World Uses; Project Milestone II Due Nov 4 |
Week 12 (Nov 5-11) | Big Data Case Studies |
Week 13 (Nov 12-18) | Big Data Case Studies (cont.) |
Week 14 (Nov 19-25) | Thanksgiving Break |
Week 15 (Nov 26 - Dec 2) | Big Data Challenges; Case Essay Due Dec 2 |
Week 16 (Dec 3-10) | Course Wrap-up; Final Report and Demo Due Dec 10 |
Grading
Deliverables | Points (Total = 100) |
Participation and Discussions (10%) | Introduction: 1; Discussion #1: 3; Discussion #2: 3; Discussion #3: 3 |
Hands-on Practices (50%) | Hands-on #1: 15; Hands-on #2: 17.5; Hands-on #3: 17.5 |
Case Studies (15%) | Case Essay: 15 |
Semester Project (25%) | Milestone I: 2 |
All assignments must be submitted by 11:59 p.m. Pacific Time on the day the assignment is due. Late assignments will be reduced by 20% of point value per day late. Please contact Dr. Chen if a medical or a family/personal emergency prevents you from submitting an assignment on time.
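As a rough illustration of the late policy above, the arithmetic can be sketched in Python. This is only one possible reading, not an official calculator: it assumes the 20% is taken from the assignment's full point value for each day late, that a partial day counts as a full day, and that a score cannot fall below zero. The function name `late_adjusted_score` and its parameters are hypothetical.

```python
import math


def late_adjusted_score(earned: float, max_points: float, days_late: float) -> float:
    """Apply the late penalty under the assumptions stated above.

    Assumed reading: each day late removes 20% of the assignment's
    full point value (max_points), and a partial day counts as a full day.
    """
    if days_late <= 0:
        return earned  # submitted on time, no penalty
    penalty = 0.20 * max_points * math.ceil(days_late)
    # Assumption: the adjusted score is floored at zero.
    return max(0.0, earned - penalty)


# Example: Hands-on #1 is worth 15 points; a submission that earned
# full marks but arrived one day late loses 20% of 15 points.
print(late_adjusted_score(earned=15, max_points=15, days_late=1))
```

Under this reading, an assignment submitted five or more days late would receive no points.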
Course Workload Expectations
Success in this course is based on the expectation that students will spend, for each unit of credit, a minimum of forty-five hours over the length of the course (normally 3 hours per unit per week with 1 of the hours used for lecture) for instruction or preparation/studying or course related activities including but not limited to internships, labs, clinical practica. Other course structures will have equivalent workload expectations as described in the syllabus.
Instructional time may include but is not limited to:
Working on posted modules or lessons prepared by the instructor; discussion forum interactions with the instructor and/or other students; making presentations and getting feedback from the instructor; attending office hours or other synchronous sessions with the instructor.
Student time outside of class:
In any seven-day period, a student is expected to be academically engaged through submitting an academic assignment; taking an exam or an interactive tutorial, or computer-assisted instruction; building websites, blogs, databases, social media presentations; attending a study group; contributing to an academic online discussion; writing papers; reading articles; conducting research; engaging in small group work.
Course Prerequisites
INFO 246 has no prerequisite requirements.
Course Learning Outcomes
Upon successful completion of the course, students will be able to:
- Describe and explain how the main technologies and trends in big data work, specifically data visualization, large-scale database management, the map-reduce paradigm, and big data mining.
- Demonstrate proficiency in using Splunk to solve big data analytical problems.
- Interpret and communicate big data analysis and visualization results appropriately, effectively and accurately.
- Discuss, articulate and compare various big data management issues (e.g., big data privacy) in different practices such as libraries and business.
- Make informed and strategic decisions with the presence of large-scale data sets.
Core Competencies (Program Learning Outcomes)
INFO 246 supports the following core competencies:
- E Design, query, and evaluate information retrieval systems.
- H Demonstrate proficiency in identifying, using, and evaluating current and emerging information and communication technologies.
Textbooks
Recommended Textbooks:
- Sathi, A. (2012). Big data analytics: Disruptive technologies for changing the game. MC Press Online. Available through Amazon: 1583473807
Grading Scale
The standard SJSU School of Information Grading Scale is utilized for all iSchool courses:
97 to 100 | A |
94 to 96 | A minus |
91 to 93 | B plus |
88 to 90 | B |
85 to 87 | B minus |
82 to 84 | C plus |
79 to 81 | C |
76 to 78 | C minus |
73 to 75 | D plus |
70 to 72 | D |
67 to 69 | D minus |
Below 67 | F |
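For students who want to check where a point total falls on the scale above, the lookup can be sketched in Python. The table lists whole-number ranges only, so this sketch assumes each range's lower bound acts as a cutoff (for example, 96.5 is treated as an A minus); how fractional totals are actually handled is not stated in the syllabus. The names `GRADE_FLOORS` and `letter_grade` are hypothetical.

```python
# Lower bound of each range in the grading scale, highest first.
GRADE_FLOORS = [
    (97, "A"),
    (94, "A minus"),
    (91, "B plus"),
    (88, "B"),
    (85, "B minus"),
    (82, "C plus"),
    (79, "C"),
    (76, "C minus"),
    (73, "D plus"),
    (70, "D"),
    (67, "D minus"),
]


def letter_grade(total_points: float) -> str:
    """Map a course total (0 to 100) to a letter grade.

    Assumption: a total at or above a range's lower bound earns that
    grade; anything below 67 is an F.
    """
    for floor, letter in GRADE_FLOORS:
        if total_points >= floor:
            return letter
    return "F"


print(letter_grade(88))
```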
In order to provide consistent guidelines for assessment for graduate level work in the School, these terms are applied to letter grades:
- C represents Adequate work; a grade of "C" counts for credit for the course;
- B represents Good work; a grade of "B" clearly meets the standards for graduate level work or undergraduate (for BS-ISDA);
For core courses in the MLIS program (not MARA, Informatics, BS-ISDA), namely INFO 200, INFO 202, and INFO 204, the iSchool requires that students earn a B in the course. If the grade is less than B (B- or lower) after the first attempt, you will be placed on administrative probation. You must repeat the class if you wish to stay in the program. If, on the second attempt, you do not pass the class with a grade of B or better (not B- but B), you will be disqualified.
- A represents Exceptional work; a grade of "A" will be assigned for outstanding work only.
Graduate Students are advised that it is their responsibility to maintain a 3.0 Grade Point Average (GPA). Undergraduates must maintain a 2.0 Grade Point Average (GPA).
University Policies
Per University Policy S16-9, university-wide policy information relevant to all courses, such as academic integrity, accommodations, etc., is available on the Office of Graduate and Undergraduate Programs' Syllabus Information web page at: https://www.sjsu.edu/curriculum/courses/syllabus-info.php. Make sure to visit this page and to review and become familiar with these university policies and resources.
In order to request an accommodation in a class please contact the Accessible Education Center and register via the MyAEC portal.