Data Science - Engineering


Location: Pontificia Universidad Javeriana campus, Bogotá, Colombia.


We welcome you to the courses in Data Science and Design Thinking, which will be offered at Javeriana University by a world-renowned scholar, Professor Jennifer Widom. Jennifer Widom is the Frederick Emmons Terman Dean of the School of Engineering and the Fletcher Jones Professor in Computer Science and Electrical Engineering at Stanford University. See https://profiles.stanford.edu/jennifer-widom for more information about her. In both courses, we aim for equal gender participation and for students from across the regions of Colombia.

The 3-day in-person course in Data Science is designed mainly for excellent engineering students who already have some programming skills. The 1-day in-person course in Design Thinking is aimed at advanced high school students as well as engineering and design students. To apply, students must have been officially registered in the most recent academic period (first semester of 2022) and must present a letter of recommendation from an academic authority of the institution where they are registered. For the course in Data Science, students from all universities in Colombia are welcome, but please note that we will guarantee 20 seats to schools of engineering that are members of ACOFI (please contact ACOFI for that purpose). These courses represent a unique opportunity not only to learn but also to interact with students from different institutions and regions of Colombia. We hope that these two wonderful courses will contribute to building a strong engineering ecosystem in Colombia. Enjoy!


Lope Hugo Barrero
Dean, School of Engineering
Pontificia Universidad Javeriana

Jennifer Widom is the Frederick Emmons Terman Dean of the School of Engineering and the Fletcher Jones Professor in Computer Science and Electrical Engineering at Stanford University. She served as Computer Science Department Chair from 2009 to 2014 and as School of Engineering Senior Associate Dean from 2014 to 2016. Jennifer received her Bachelor's degree from the Indiana University Jacobs School of Music in 1982 and her Computer Science Ph.D. from Cornell University in 1987. She was a Research Staff Member at the IBM Almaden Research Center before joining the Stanford faculty in 1993. Her research interests span many aspects of nontraditional data management. She is an ACM Fellow and a member of the National Academy of Engineering and the American Academy of Arts & Sciences; she received a Guggenheim Fellowship in 2000, the ACM SIGMOD Edgar F. Codd Innovations Award in 2007, the ACM-W Athena Lecturer Award in 2015, and the EPFL-WISH Foundation Erna Hamburger Prize in 2018.


Course
 

July 25, 26, and 27, 2022

Course duration: 24 hours | In-person format

  • Prerequisites - none
  • Topics - Data-driven applications and services; brief introduction to data manipulation and analysis, data mining, machine learning, data visualization, data collection and preparation; pitfalls: correlation and causation, underfitting and overfitting, privacy, and others; brief introduction to languages, systems, and platforms for working with data
  • Length - 1.5-2 hours
  • Style - Prof. Widom presentation with audience Q&A

  • Prerequisites - O1
  • Topics - Manipulating and analyzing data using spreadsheets including pivot tables
  • Length - 2-2.5 hours
  • Style - Students work along with Prof. Widom and work on assigned problems
  • Software - Google Sheets

  • Prerequisites - S1
  • Topics - Data visualization motivation; spreadsheet bar charts, pie charts, scatterplots, maps
  • Length - 1.5-2 hours
  • Style - Students work along with Prof. Widom and work on assigned problems
  • Software - Google Sheets

  • Prerequisites - V1
  • Topics - Tableau bar charts, pie charts, scatterplots, packed bubbles, maps; Tableau dashboards; publishing interactive visualizations
  • Length - 1-2 hours
  • Style - Students work along with Prof. Widom and work on assigned problems
  • Software - Tableau Public

  • Prerequisites - O1
  • Topics - Introduction to relational database management systems (RDBMS); relational data model; creating and loading data; basics of SQL query language
  • Length - 1.5-2 hours
  • Style - Prof. Widom presentation interleaved with students working on assigned problems
  • Software - SQLite relational database system via Google Colab

  • Prerequisites - D1
  • Topics - More advanced SQL constructs (aggregation, subqueries, data modification, and others); coverage configurable to available time
  • Length - 1-2 hours
  • Style - Prof. Widom presentation interleaved with students working on assigned problems
  • Software - SQLite relational database system via Google Colab
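
As a rough illustration of the kind of SQL constructs named in this module (aggregation, subqueries, data modification), here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table and its rows are hypothetical and are not taken from the course materials.

```python
import sqlite3

# Hypothetical toy data; the course's actual datasets are not described here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (city TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Bogota", "coffee", 12.0),
        ("Bogota", "tea", 5.0),
        ("Medellin", "coffee", 8.0),
        ("Cali", "coffee", 3.0),
    ],
)

# Aggregation: total amount per city.
totals = conn.execute(
    "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
).fetchall()

# Subquery: cities whose total exceeds the average amount of a single row.
above_avg = conn.execute(
    """
    SELECT city FROM sales
    GROUP BY city
    HAVING SUM(amount) > (SELECT AVG(amount) FROM sales)
    ORDER BY city
    """
).fetchall()

# Data modification: double every amount recorded for one city.
conn.execute("UPDATE sales SET amount = amount * 2 WHERE city = 'Cali'")
cali_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE city = 'Cali'"
).fetchone()[0]

print(totals)
print(above_avg)
print(cali_total)
```

The same statements can be run in a Google Colab notebook, since sqlite3 ships with the Python standard library.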

  • Prerequisites - O1
  • Topics - Introduction to Python; manipulating data in Python; plotting in Python; Pandas package
  • Length - 3-4 hours
  • Style - Prof. Widom presentation interleaved with students working on assigned problems
  • Software - Python via Google Colab

  • Prerequisites - O1
  • Topics - History of data mining; market-basket data; frequent item-sets; association rules
  • Length - 1 hour
  • Style - Prof. Widom presentation with audience Q&A

  • Prerequisites - M1, D2
  • Topics - Computing frequent item-sets and association rules using relational databases and SQL
  • Length - 1-2 hours
  • Style - Prof. Widom presentation and students work on assigned problems
  • Software - SQLite relational database system via Google Colab

  • Prerequisites - M1, P1
  • Topics - Computing frequent item-sets and association rules using Python
  • Length - 1.5-2 hours
  • Style - Prof. Widom presentation and students work on assigned problems
  • Software - Python via Google Colab
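
To give a sense of what computing frequent item-sets and association rules in Python can look like, here is a minimal sketch restricted to item pairs. The baskets, the helper names `frequent_pairs` and `confidence`, and the 0.5 support threshold are all illustrative assumptions, not course material.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data: each set is one shopping basket.
baskets = [
    {"bread", "milk"},
    {"bread", "eggs", "milk"},
    {"eggs", "milk"},
    {"bread", "milk", "butter"},
]


def frequent_pairs(baskets, min_support):
    """Return item pairs whose support (fraction of baskets) is >= min_support."""
    counts = Counter()
    for basket in baskets:
        # Sort items so each pair has one canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def confidence(baskets, lhs, rhs):
    """Confidence of the rule lhs -> rhs: share of baskets with lhs that also hold rhs."""
    with_lhs = [b for b in baskets if lhs in b]
    if not with_lhs:
        return 0.0
    return sum(1 for b in with_lhs if rhs in b) / len(with_lhs)


pairs = frequent_pairs(baskets, min_support=0.5)
print(pairs)
print(confidence(baskets, "bread", "milk"))
print(confidence(baskets, "milk", "bread"))
```

This brute-force count enumerates every pair in every basket, which is fine for small examples; larger data usually calls for pruning strategies such as the Apriori approach.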

  • Prerequisites - O1 required, S1 recommended but not required
  • Topics - Regression introduction and applications; simple linear regression; regression and correlation; regression shortcomings and dangers; polynomial regression
  • Length - 1.5-2 hours
  • Style - Prof. Widom presentation and students work on assigned problems
  • Software - Google Sheets

  • Prerequisites - O1, L1 recommended but not required
  • Topics - Introduction to classification; k-nearest-neighbors; decision trees; Naïve Bayes classifiers
  • Length - 1 hour
  • Style - Prof. Widom presentation with audience Q&A

  • Prerequisites - O1, L2
  • Topics - Introduction to clustering; k-means
  • Length - Less than 1 hour
  • Style - Prof. Widom presentation with audience Q&A

  • Prerequisites - P1, one or more of {L1, L2, L3}
  • Topics - Python packages for regression, classification, and clustering
  • Length - 1-2 hours depending on coverage
  • Style - Prof. Widom presentation and students work on assigned problems
  • Software - Python via Google Colab

  • Prerequisites - O1, P1
  • Topics - Manipulating data in R; plotting in R
  • Length - 1-2 hours
  • Style - Prof. Widom presentation interleaved with students working on assigned problems
  • Software - R via Google Colab

  • Prerequisites - O1
  • Topics - Correlation versus causation; determining correlation; determining causation
  • Length - Less than 1 hour
  • Style - Prof. Widom presentation with audience Q&A

  • Prerequisites - P1
  • Topics - Modeling networks as undirected and directed graphs; analyzing graph properties; programming using networkx package
  • Length - 1.5-2 hours
  • Style - Prof. Widom presentation and students work on assigned problems
  • Software - Python via Google Colab

  • Prerequisites - P1
  • Topics - Text analysis & natural-language processing; image analysis
  • Length - 1.5-2 hours
  • Style - Prof. Widom presentation and students work on assigned problems
  • Software - Python via Google Colab