LIBR 246-15
Information Technology Tools and Applications – Advanced
Topic: Big Data Analytics and Management
Fall 2014 Greensheet

Dr. Michelle Chen
E-mail

Office Hours: 
Virtually, by appointment via e-mail or Blackboard IM. Optional drop-in office hours will also be held on Blackboard Collaborate as needed. More details TBA on the Canvas course website.



Canvas Information: This course will be available beginning Monday, August 25th, 12AM PST. You will be enrolled into the site automatically.

Course Description

With the prevalence of advanced web and storage technologies, more and more data are produced in massive amounts at a rapid rate. The data that come from instruments, sensors, Internet transactions, emails, click streams, and other digital sources require new ways of analyzing, interpreting, learning from, and managing data efficiently. As a result, "big data" analytics and management has gained increasing attention and become one of the most significant emerging fields in many disciplines, ranging from business intelligence to scientific discovery.

In this course, students will explore important big data technologies, trends, infrastructure, and management issues that enable users to make informed and strategic decisions in the presence of large-scale data sets. Specifically, the course will consist of three main parts:

  1. Big data infrastructure: Students will focus on learning big data technologies and trends, including large-scale databases, map-reduce paradigm, and big data mining.
     
  2. Big data hands-on practice: Students will gain hands-on learning experience with the software "Splunk". Students will be able to conduct big data analysis and visualization in Splunk with real-world data sets such as Twitter data (i.e., "tweets").
     
  3. Big data real-world uses: Students will discuss how big data analytics and management skills can be applied to different real-world fields, such as libraries and health science, and various issues including advantages and challenges.

Course Requirements

Assignments

  • Participation and Discussions (15%, supports SLO#1, SLO#2)
    Students are required to actively participate in class and make thoughtful contributions to three class discussions posted on the course site. Students will be evaluated on their involvement in and intellectual contribution to the collaborative learning environment.

  • Hands-on Practices (45%, supports SLO#1, SLO#2, SLO#3, SLO#5)
    Three individual, hands-on practices will be given throughout the semester to help students review and reinforce what they have learned in class. Students will learn how to analyze and visualize big data with ManyEyes, Weka, and Splunk.
     
  • Case Studies (15%, supports SLO#4)
    Students will discuss social, organizational, and managerial issues of big data applications in practice through various case studies and submit a case essay that answers and critiques the questions from the cases.
     
  • Semester Project (25%, supports SLO#1, SLO#2, SLO#3, SLO#4, SLO#5)
    Students will work in groups on a semester project that consists of three phases (more details TBA on the Canvas site):
     
    • Milestone I - Initial Thoughts: Students will submit a short paragraph discussing the potential topics and directions of the group project. Students will also briefly present the motivation of the study and the approach that might be taken.
       
    • Milestone II - Mini Report: Students will submit a one-page report outlining the current progress of the group project. The report will include what has been done, what the current status and results are, and what needs to be accomplished.
       
    • Final Presentation and Report: Students will give asynchronous 15-minute presentations on Collaborate and share the recording links with the whole class. Students will also submit a detailed, 10-page report for the project. The report should at least include the following sections: motivation, problem statement, methodology, analysis results, and discussions and conclusion. 

Course Calendar (subject to change with fair notice)

Weeks                       Topics and Due Dates
Week 1 (Aug 25-31)          Introduction and Course Overview
Week 2 (Sep 1-7)            Big Data Visualization
Week 3 (Sep 8-14)           Large-Scale Database Management; Hands-on #1 Due Sep 14
Week 4 (Sep 15-21)          Hadoop and Map-Reduce Paradigm; Discussion #1 Due Sep 21
Week 5 (Sep 22-28)          Big Data Mining; Project Milestone I Due Sep 28
Week 6 (Sep 29 - Oct 5)     Introduction to Splunk; Discussion #2 Due Oct 5
Week 7 (Oct 6-12)           Splunk Demos and Applications
Week 8 (Oct 13-19)          Splunk I: The Basics; Hands-on #2 Due Oct 19
Week 9 (Oct 20-26)          Splunk II: Advanced Topics
Week 10 (Oct 27 - Nov 2)    Splunk III: Semantic Analysis; Hands-on #3 Due Nov 2
Week 11 (Nov 3-9)           Big Data Case Studies; Project Milestone II Due Nov 9
Week 12 (Nov 10-16)         Big Data Case Studies (cont.); Discussion #3 Due Nov 16
Week 13 (Nov 17-23)         Big Data Challenges; Case Essay Due Nov 23
Week 14 (Nov 24-30)         Thanksgiving Break
Week 15 (Dec 1-7)           Course wrap-up
Week 16 (Dec 8-10)          Final Presentation and Report Due Dec 10

Grading

Deliverable                      Points (Total = 1000)
Participation and Discussions    Discussion #1: 50; Discussion #2: 50; Discussion #3: 50
Hands-on Practices               Hands-on #1: 150; Hands-on #2: 150; Hands-on #3: 150
Case Studies                     Case Essay: 150
Semester Project                 Milestone I: 20; Milestone II: 30; Final Presentation: 100; Final Report: 100

All assignments must be submitted by 11:59PM (PST) on the day the assignment is due. Late assignments will be reduced by 20% of point value per day late. Please contact Dr. Chen if a medical or a family/personal emergency prevents you from submitting an assignment on time.
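
As an illustration only, the late-penalty arithmetic can be sketched in Python. The sketch assumes that the 20% is taken from the assignment's maximum point value, that each late day counts in full, and that a score cannot drop below zero; the greensheet does not spell out these details, so check with Dr. Chen for the actual rule. The function name `late_adjusted_score` is hypothetical.

```python
LATE_PENALTY_PERCENT = 20  # from the greensheet: 20% of point value per day late


def late_adjusted_score(earned, possible, days_late):
    """Return the score after the late deduction.

    Assumptions (not stated in the greensheet): the penalty is based on
    the assignment's maximum points, and the result never goes below zero.
    """
    penalty = possible * LATE_PENALTY_PERCENT * days_late / 100
    return max(0.0, earned - penalty)


# A 150-point hands-on practice scored at 135 and submitted 2 days late:
# the penalty is 150 * 20% * 2 = 60 points.
print(late_adjusted_score(135, 150, 2))  # 75.0
```

Under these assumptions, an assignment submitted five or more days late would receive no points.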

Course Workload Expectations

Success in this course is based on the expectation that students will spend, for each unit of credit, a minimum of forty-five hours over the length of the course (normally 3 hours per unit per week with 1 of the hours used for lecture) for instruction or preparation/studying or course related activities including but not limited to internships, labs, clinical practica. Other course structures will have equivalent workload expectations as described in the syllabus.

Instructional time may include but is not limited to:
Working on posted modules or lessons prepared by the instructor; discussion forum interactions with the instructor and/or other students; making presentations and getting feedback from the instructor; attending office hours or other synchronous sessions with the instructor.

Student time outside of class:
In any seven-day period, a student is expected to be academically engaged through submitting an academic assignment; taking an exam or an interactive tutorial, or computer-assisted instruction; building websites, blogs, databases, social media presentations; attending a study group; contributing to an academic online discussion; writing papers; reading articles; conducting research; engaging in small group work.

Course Prerequisites

LIBR 202; other prerequisites may be added depending on content.

Student Learning Outcomes

Upon successful completion of the course, students will be able to:

  1. Describe and explain how the main technologies and trends in big data work, specifically data visualization, large-scale database management, map-reduce paradigm, and big data mining.
  2. Demonstrate proficiency in using Splunk to solve big data analytical problems.
  3. Interpret and communicate big data analysis and visualization results appropriately, effectively and accurately.
  4. Discuss, articulate and compare various big data management issues (e.g., big data privacy) in different practices such as libraries and business.
  5. Make informed and strategic decisions in the presence of large-scale data sets.

Core Competencies (Program Learning Outcomes)

LIBR 246 supports the following core competencies:

  1. E Design, query and evaluate information retrieval systems.
  2. H Demonstrate proficiency in identifying, using, and evaluating current and emerging information and communication technologies.

Textbooks

Recommended Textbooks:

  • Sathi, A. (2012). Big data analytics: Disruptive technologies for changing the game. Boise, ID: MC Press Online. Available through Amazon: 1583473807

Grading Scale

The standard SJSU School of Information Grading Scale is utilized for all iSchool courses:

97 to 100 A
94 to 96 A minus
91 to 93 B plus
88 to 90 B
85 to 87 B minus
82 to 84 C plus
79 to 81 C
76 to 78 C minus
73 to 75 D plus
70 to 72 D
67 to 69 D minus
Below 67 F
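
For illustration, the scale above can be read as a lookup from a percentage to a letter grade. The Python sketch below assumes that the 1000 course points convert directly to a percentage and that a fractional percentage falls into the band whose lower bound it meets, with no rounding; the scale itself lists only whole numbers, so this treatment is an assumption. The names `GRADE_SCALE`, `course_percent`, and `letter_grade` are hypothetical.

```python
# Lower bound of each band, highest first, as listed in the grading scale.
GRADE_SCALE = [
    (97, "A"), (94, "A minus"), (91, "B plus"), (88, "B"), (85, "B minus"),
    (82, "C plus"), (79, "C"), (76, "C minus"), (73, "D plus"), (70, "D"),
    (67, "D minus"),
]


def course_percent(points_earned, total_points=1000):
    """Convert course points to a percentage (total of 1000 per the greensheet)."""
    return 100 * points_earned / total_points


def letter_grade(percent):
    """Return the letter for the first band whose lower bound is met."""
    for lower_bound, letter in GRADE_SCALE:
        if percent >= lower_bound:
            return letter
    return "F"


print(letter_grade(course_percent(885)))  # 88.5 percent falls in the B band
```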

 

In order to provide consistent guidelines for assessment for graduate level work in the School, these terms are applied to letter grades:

  • C represents Adequate work; a grade of "C" counts for credit for the course;
  • B represents Good work; a grade of "B" clearly meets the standards for graduate level work;
    For core courses in the MLIS program (not MARA), namely INFO 200, INFO 202, and INFO 204, the iSchool requires that students earn a B in the course. If the grade is less than B (B- or lower) after the first attempt, you will be placed on administrative probation. You must repeat the class the following semester. If, on the second attempt, you do not pass the class with a grade of B or better (not B- but B), you will be disqualified.
  • A represents Exceptional work; a grade of "A" will be assigned for outstanding work only.

Students are advised that it is their responsibility to maintain a 3.0 Grade Point Average (GPA).

University Policies

General Expectations, Rights and Responsibilities of the Student

As members of the academic community, students accept both the rights and responsibilities incumbent upon all members of the institution. Students are encouraged to familiarize themselves with SJSU's policies and practices pertaining to the procedures to follow if and when questions or concerns about a class arise. See University Policy S90-5 at http://www.sjsu.edu/senate/docs/S90-5.pdf. More detailed information on a variety of related topics is available in the SJSU catalog at http://info.sjsu.edu/web-dbgen/catalog/departments/LIS.html. In general, it is recommended that students begin by seeking clarification or discussing concerns with their instructor. If such conversation is not possible, or if it does not serve to address the issue, it is recommended that the student contact the Department Chair as a next step.

Dropping and Adding

Students are responsible for understanding the policies and procedures about add/drop, grade forgiveness, etc. Refer to the current semester's Catalog Policies section at http://info.sjsu.edu/static/catalog/policies.html. Add/drop deadlines can be found on the current academic year calendars document on the Academic Calendars webpage at http://www.sjsu.edu/provost/services/academic_calendars/. The Late Drop Policy is available at http://www.sjsu.edu/aars/policies/latedrops/policy/. Students should be aware of the current deadlines and penalties for dropping classes.

Information about the latest changes and news is available at the Advising Hub at http://www.sjsu.edu/advising/.

Consent for Recording of Class and Public Sharing of Instructor Material

University Policy S12-7, http://www.sjsu.edu/senate/docs/S12-7.pdf, requires students to obtain the instructor's permission to record the course and requires the following items to be included in the syllabus:

  • "Common courtesy and professional behavior dictate that you notify someone when you are recording him/her. You must obtain the instructor's permission to make audio or video recordings in this class. Such permission allows the recordings to be used for your private, study purposes only. The recordings are the intellectual property of the instructor; you have not been given any rights to reproduce or distribute the material."
    • It is suggested that the syllabus include the instructor's process for granting permission, whether in writing or orally and whether for the whole semester or on a class by class basis.
    • In classes where active participation of students or guests may be on the recording, permission of those students or guests should be obtained as well.
  • "Course material developed by the instructor is the intellectual property of the instructor and cannot be shared publicly without his/her approval. You may not publicly share or upload instructor generated material for this course such as exam questions, lecture notes, or homework solutions without instructor consent."

Academic integrity

Your commitment, as a student, to learning is evidenced by your enrollment at San Jose State University. The University Academic Integrity Policy F15-7 at http://www.sjsu.edu/senate/docs/F15-7.pdf requires you to be honest in all your academic course work. Faculty members are required to report all infractions to the office of Student Conduct and Ethical Development. The Student Conduct and Ethical Development website is available at http://www.sjsu.edu/studentconduct/.

Campus Policy in Compliance with the Americans with Disabilities Act

If you need course adaptations or accommodations because of a disability, or if you need to make special arrangements in case the building must be evacuated, please make an appointment with me as soon as possible, or see me during office hours. Presidential Directive 97-03 at http://www.sjsu.edu/president/docs/directives/PD_1997-03.pdf requires that students with disabilities requesting accommodations must register with the Accessible Education Center (AEC) at http://www.sjsu.edu/aec to establish a record of their disability.
