Course concept progression

The following table maps course session dates, lesson topics, references, and content links for ATE-252, a cross-disciplinary exploration of issues in transportation analytics.

course date wk no. session links learning objectives out-of-class work
ATE-252 Tue
28-Jan-2020
1

Introduction to data analytics

ATE-252 Tue
8-Feb-2020
2

Tales from the real world:
Working in transportation analytics

Discussion with Mark Egge of High Street Consulting

Course updates

Electricity intro and sensor articles

Shared by Trish!

ATE-252 Tue
15-Feb-2020
3

Exploration group work day I

ATE-252 Tue
22-Feb-2020
4

Exploration group work day II

ATE-252 Tue
29-Feb-2020
5

Review workgroup progress

ATE-252 Tue
7-March-2020
6

Second round of work group time I

ATE-252 Tue
14-March-2020
-

CANCELLED!

Cancelled as a precaution to reduce viral transmission.

ATE-252 Tue
21-March-2020
-

Rescheduled "spring break"

ATE-252 Tue
28-March-2020
9

MtngID: 614 961 8122
Ph:+1 646-558-8656

Reworking our course

Since it's unlikely we'll have in-person meetings again, we need to decide on a path forward for the remainder of the course. Ideas to consider:

  • Working independently with hardware kits that Eric could prepare and send out after 4 April
  • Scrapping hardware altogether and falling back on a more traditional capstone project related to transportation data analysis
ATE-252 Tue
4-April-2020
10

MtngID: 614 961 8122
Ph:+1 646-558-8656

Project Setup

  1. Pin down a primary dataset (fallback is the PA crash data)
  2. Identify a "first pass" inquiry question
  3. Create a cause-effect flow chart that identifies your outcome variables and independent variables
    • Investigate the type and variation range of each variable of interest
    • Do we have enough data reported?
    • What would be the "analysis data type"? That is, even if the original data type is Text, what do you anticipate the end-variable type becoming after cleaning and transforming? (Use OpenRefine to clean/code)
  4. Install OpenRefine
  5. Watch these three videos (1, 2, 3) and start playing with OpenRefine.
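The variable checks in step 3 can also be sketched in pandas, before or alongside OpenRefine. This is a minimal sketch on a tiny made-up table: the column names (`crash_year`, `weather`, `fatal_count`) and all values are hypothetical stand-ins for whatever your dataset actually contains.

```python
import pandas as pd

# Hypothetical sample rows; real column names and codes will differ.
raw = pd.DataFrame({
    "crash_year": ["2017", "2018", "2018", None],
    "weather": ["Rain", "Clear", "rain ", "Clear"],
    "fatal_count": ["0", "1", "0", "0"],
})


def summarize(df):
    """Report the type, distinct-value count, and percent missing per column."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "n_unique": df.nunique(),
        "pct_missing": df.isna().mean() * 100,
    })


summary = summarize(raw)
print(summary)

# Anticipated "analysis data types" after cleaning: text becomes
# integer, category, or numeric depending on the variable.
clean = pd.DataFrame({
    "crash_year": pd.to_numeric(raw["crash_year"], errors="coerce").astype("Int64"),
    "weather": raw["weather"].str.strip().str.lower().astype("category"),
    "fatal_count": pd.to_numeric(raw["fatal_count"], errors="coerce"),
})
print(clean.dtypes)
```

A high `pct_missing` is one signal that not enough data is reported for a variable, and an unexpectedly high `n_unique` on a text column often points to inconsistent spelling or stray whitespace that needs cleaning.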
ATE-252 Sat
11-April-2020
11

Cleaning data in OpenRefine

Eric: Map students to JVMs

  1. Attempt to read the full crash data into OpenRefine. If that works, start transforming columns per your question.
ATE-252 Tue
18-April-2020
11

MtngID: 614 961 8122
Ph:+1 646-558-8656

Thanks to Jill for saving the day...

  1. Use the online filter mechanism on the PA crash data home to extract the subset of the 2 million rows relevant to your study. To get there, open the crash data home and click the gray "View Data" box in the upper right corner. Once the interactive tool has loaded, choose Filter and add conditions by column.
  2. Create and export a subset specific to your inquiry question, then try loading it in OpenRefine and/or Python for basic descriptive stats.
  3. Remember to increase OpenRefine's memory allowance by following this tutorial
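For the Python route in step 2, loading an exported subset and pulling basic descriptive stats might look like the sketch below. It reads from an in-memory string so it runs as-is; in practice you would pass the path of your own CSV export to `pd.read_csv`. The column names and values are made up for illustration and are not the actual crash data fields.

```python
import io

import pandas as pd

# Stand-in for an exported subset (hypothetical columns and values).
exported_csv = io.StringIO(
    "crash_id,county,person_count,hour_of_day\n"
    "1,Allegheny,2,8\n"
    "2,Allegheny,1,17\n"
    "3,Butler,4,17\n"
    "4,Allegheny,3,23\n"
)

subset = pd.read_csv(exported_csv)

# Numeric summary: count, mean, std, min, quartiles, max.
numeric_stats = subset[["person_count", "hour_of_day"]].describe()
print(numeric_stats)

# Frequency table for a categorical column.
county_counts = subset["county"].value_counts()
print(county_counts)
```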
ATE-252 Tue
25-April-2020
12

Wrangling and Visualization

ATE-252 Tue
2-May-2020
13

Building a tool chain in R with Mark Egge

Screencast of the whole class uploaded to YouTube, with the accompanying code in Mark's Git repo, which has a lovely README

Exploratory data analysis

Use pandas to create a DataFrame of only your most relevant variables, trimming down rows using .loc[] as needed.
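As a minimal sketch of that trimming step, `.loc[]` takes a row condition and a column list together. The DataFrame below is a made-up stand-in, and the column names and the "evening" filter are hypothetical; substitute the variables and conditions that match your inquiry question.

```python
import pandas as pd

# Hypothetical crash records; your real data will have different columns.
crashes = pd.DataFrame({
    "crash_id": [1, 2, 3, 4, 5],
    "county": ["Allegheny", "Allegheny", "Butler", "Allegheny", "Butler"],
    "hour_of_day": [8, 17, 17, 23, 6],
    "person_count": [2, 1, 4, 3, 1],
    "road_name": ["A St", "B Ave", "C Rd", "D Blvd", "E Ln"],
})

# Keep only the variables relevant to the question.
relevant_cols = ["county", "hour_of_day", "person_count"]

# Boolean mask selecting the rows of interest (here: 17:00 or later).
is_evening = crashes["hour_of_day"] >= 17

# .loc[rows, columns] trims both dimensions in one step.
core = crashes.loc[is_evening, relevant_cols]
print(core)
```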

Create histograms using .hist() and simple plots of your core variables, saving the output to a notebook or as raw image files in your project repo.
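One way to produce and save these plots outside a notebook is sketched below, assuming matplotlib is installed alongside pandas. The data and file names are hypothetical, and the temporary output folder is only there so the example runs anywhere; point `out_dir` at a folder in your project repo instead.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render to files without needing a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical core variables.
core = pd.DataFrame({
    "hour_of_day": [8, 17, 17, 23, 6, 12, 18, 7],
    "person_count": [2, 1, 4, 3, 1, 2, 2, 5],
})

out_dir = tempfile.mkdtemp()  # replace with a folder in your project repo

# DataFrame.hist() draws one histogram per numeric column.
axes = core.hist(bins=6, figsize=(8, 3))
hist_fig = axes.flatten()[0].figure
hist_fig.tight_layout()
hist_path = os.path.join(out_dir, "core_histograms.png")
hist_fig.savefig(hist_path)

# A simple scatter plot of one core variable against another.
ax = core.plot.scatter(x="hour_of_day", y="person_count")
scatter_path = os.path.join(out_dir, "hour_vs_person_count.png")
ax.figure.savefig(scatter_path)

plt.close("all")
```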

ATE-252 Tue
9-May-2020
14

MtngID: 614 961 8122
Ph:+1 646-558-8656

Share final projects with CCAC open house