BCSE206L - Foundations of Data Science

Instructor
Dr. Bhargavi R
Slot And Venue Details
Slot: B1 + TB1, Venue: AB3 Block 108
Slot: B2 + TB2, Venue: AB3 Block 106

Course Overview

The "Foundations of Data Science" course provides the essential theoretical framework for data analysis and predictive modeling. It establishes the mathematical language by differentiating qualitative and quantitative data types and covering core descriptive statistics. The curriculum covers basic statistics, including types of statistics, sampling, and correlation. Basic data analysis using SQL is also covered. The course introduces foundational concepts for commonly used toolsets, such as Tableau for data visualization principles and Octave for numerical computation and matrix operations. Key concepts in Machine Learning are introduced, including the mechanics of Decision Trees through metrics such as Information Gain. The course also explains why data preparation is necessary, including methods for handling missing data and the rationale behind various Feature Selection techniques. More details on the topics covered can be found in the syllabus.

Syllabus

You can find the syllabus of this course here

Textbooks

The following textbooks are used for reference:

Tentative Schedule

Date Lecture Readings Announcements
Welcome to Foundations of Data Science!
Thu, 4th Dec Lecture 1: Meet and Greet, Introduction to Data Science applications [Slides]
  • Ch 1 - Sanjeev Wagh
Mon, 8th Dec Lecture 2: Data Science Introduction continued: different applications, need for DS, etc. [Slides]
  • Ch 1 - Sanjeev Wagh
Tue, 9th Dec Lecture 3: Data Science Process [Slides]
  • Ch 1 - Sanjeev Wagh
Thu, 11th Dec Lecture 4: Data Science Process [Slides]
  • Ch 1 - Sanjeev Wagh
Mon, 15th Dec Lecture 5: BI, Data Analysis and Data Analytics, Core components of BI [Slides]
  • Ch 1 - Sanjeev Wagh
Tue, 16th Dec Lecture 6: Prerequisites for a Data Scientist, Tools and Skills required [Slides]
  • Ch 1 - Sanjeev Wagh
Thu, 18th Dec Lecture 7: Data Types, Variable types, Descriptive and inferential statistics, Sampling techniques [Slides]
  • Ch 2 - Sanjeev Wagh
Welcome New Year 2026!
Mon, 5th Jan 2026 Lecture 8: Data Analytics life cycle, Discovery [Slides]
  • Ch 4 - Sanjeev Wagh
Tue, 6th Jan 2026 Lecture 9: Data Preprocessing, handling missing data [Slides]
  • Ch 4 - Sanjeev Wagh
Mon, 12th Jan 2026 Lecture 10: Data Preprocessing, Transformation, etc. [Slides]
  • Ch 4 - Sanjeev Wagh
Mon, 19th Jan 2026 Lecture 11: Data Preprocessing, Feature selection [Slides]
  • Ch 4 - Sanjeev Wagh
Tue, 20th Jan 2026 Lecture 12: Model Evaluation, Classification and Regression metrics [Slides]
  • Ch 4 - Sanjeev Wagh
Thu, 22nd Jan 2026 Lecture 13: Discussion [Slides]
Mon, 2nd Feb 2026 Lecture 14: Databases for Data Science - Introduction, pandas for SQL [Slides]
  • Ch 3 - Sanjeev Wagh
Tue, 3rd Feb 2026 Lecture 15: Data Munging with SQL [Slides]
  • Ch 3 - Sanjeev Wagh
Thu, 5th Feb 2026 Lecture 16: Data Munging, Filtering [Slides]
  • Ch 3 - Sanjeev Wagh
Mon, 9th Feb 2026 Lecture 17: Joins [Slides]
  • Ch 3 - Sanjeev Wagh
Tue, 10th Feb 2026 Lecture 18: Window functions and ordered data [Slides]
  • Ch 3 - Sanjeev Wagh
Thu, 12th Feb 2026 Lecture 19: Window functions and ordered data [Slides]
  • Ch 3 - Sanjeev Wagh
Mon, 16th Feb 2026 Lecture 20: Discussion [Slides]
Tue, 17th Feb 2026 Lecture 21: Aggregation [Slides]
  • Ch 3 - Sanjeev Wagh
Thu, 19th Feb 2026 Lecture 22: Preparing data for analytics tools, NoSQL [Slides]
  • Ch 3 - Sanjeev Wagh
Mon, 23rd Feb 2026 Lecture 23: Data analytics on Text - Introduction, Information Retrieval [Slides]
  • Ch 6 - Sanjeev Wagh
Tue, 24th Feb 2026 Lecture 24: Term-document incidence matrix, Inverted index, Boolean retrieval algorithm, Evaluation metrics [Slides]
  • Ch 6 - Sanjeev Wagh
Thu, 26th Feb 2026 Lecture 25: Text mining, stages, preprocessing, tokenization, stemming/lemmatization, text transformation [Slides]
  • Ch 6 - Sanjeev Wagh
Mon, 2nd March 2026 Lecture 26: POS tagging, parsing [Slides]
  • Ch 6 - Sanjeev Wagh
Tue, 3rd March 2026 Lecture 27: Multinomial Naive Bayes [Slides]
Thu, 5th March 2026 Lecture 28: NLP [Slides]
  • Ch 6 - Sanjeev Wagh
Mon, 9th March 2026 Lecture 29: Practice Session SQL
Tue, 10th March 2026 Lecture 30: Practice Session - Text preprocessing
Thu, 12th March 2026 Lecture 31: Practice Session
Mon, 23rd March 2026 Lecture 32: Pandas [Pandas-Part1]
  • Ch 7 - Sanjeev Wagh
Tue, 24th March 2026 Lecture 33: Pandas [Pandas-Part2]
  • Ch 7 - Sanjeev Wagh
Thu, 26th March 2026 Lecture 34: Pandas [Pandas-Part2]
  • Ch 7 - Sanjeev Wagh
Mon, 30th March 2026 Lecture 35: Clustering - K-means [Slides] [KMeansClustering application] [KMeansClustering - Selecting the number of clusters]
  • Ch 7 - Sanjeev Wagh
Tue, 31st March 2026 Lecture 36: Hierarchical clustering [Slides]
  • Ch 7 - Sanjeev Wagh