
Week 1 stations

This station set guides students through a handful of fundamental data science concepts and introduces them to the possibilities awaiting them in this new and exciting course.

The response sheet for all the stations is located here as a PDF and as an OpenDocument text file.

Station 1: Tabular data gathering

Station Goal:

Continue building the evolving data schema used to share information about students in our course.

Steps:

  1. Review the data table created by the students last term: Note the kind of information they tried to gather about one another. We want to build a more complete and accessible data table about those in this course. The artifact is useful, but we can improve on its quality.
  2. Using paper, tape, and scissors, each person in your group should decide on a new column or two of data to collect about each future station goer. Be as creative and interesting as you can! Once you have decided on a question for the class, choose a data type for the column from the types available in LibreOffice Calc. For example, one column could hold a date for "the last date you used a spreadsheet" and a second column could hold text for "What was your spreadsheet about?"
  3. Create an entry for your column in the associated data dictionary table which is printed on colored paper and located at your station. The purpose of this document is to support accurate data insertion into the info table.
  4. Once you have designed your column and made a data dictionary entry, populate all existing columns of data with info about yourself!
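To make the role of the data dictionary concrete, here is a minimal Python sketch of how a dictionary entry can support accurate data entry. The column names `last_spreadsheet_date` and `spreadsheet_topic`, the sample rows, and the `check_row` helper are all hypothetical, loosely based on the example in step 2. They are not the actual class table.

```python
from datetime import date

# Hypothetical data dictionary: one entry per column in the info table.
# The real class dictionary lives on colored paper at the station.
DATA_DICTIONARY = {
    "last_spreadsheet_date": {
        "type": date,
        "description": "The last date you used a spreadsheet",
    },
    "spreadsheet_topic": {
        "type": str,
        "description": "What was your spreadsheet about?",
    },
}


def check_row(row):
    """Return a list of problems found in one row of the info table."""
    problems = []
    for column, entry in DATA_DICTIONARY.items():
        if column not in row:
            problems.append(f"{column}: missing")
        elif not isinstance(row[column], entry["type"]):
            problems.append(f"{column}: expected {entry['type'].__name__}")
    return problems


# Made-up rows for illustration only.
good_row = {
    "last_spreadsheet_date": date(2019, 8, 20),
    "spreadsheet_topic": "monthly budget",
}
bad_row = {"last_spreadsheet_date": "last Tuesday"}

print(check_row(good_row))
print(check_row(bad_row))
```

The first row passes with no problems. The second is flagged twice: its date was entered as free text, and the topic column was left empty.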

Station 2: Primary data gathering

One aspect of our data analytics course involves gathering and analyzing primary data about parts of the world that interest us. Next week, we'll be learning about descriptive statistics for quantitative variables. To do so, we're going to gather data about a topic of our choosing using a small survey tool we call a "strip survey," which contains two questions. The first is a "spectrum" question, which asks the respondent to mark an X along a line of arbitrary length to indicate their preference between two extremes. The second, which we'll call a "slicer," is a categorical question used to segment the responses to the spectrum question and facilitate comparison across groups.

  1. Review the sample strip surveys from past terms. Read each carefully and consider the degree to which the slicer question divides respondents into groups that might logically respond differently to the spectrum question. Record your assessment of the past surveys on your response sheet in questions 2A and 2B.
  2. Choose a topic for your spectrum question. Decide on the question and carefully choose two labels for the end points of your spectrum. Try your best to design a question that is NOT phrased in a leading way, but instead allows respondents to express their own views in as unbiased a way as possible. For example, an inappropriate way to ask a spectrum question might be: "How outrageous do you think current immigration policies are in the USA?" Rephrased, you might ask: "Indicate your level of approval of current US national immigration policy between Strongly Disapprove and Strongly Approve:"
  3. Once you have a draft of your question, review that question with your peers at the station. Ask them for feedback on its design and unbiased wording.
  4. In a simple text editor on a computer of your choosing, create your strip survey and include your FIRST NAME ONLY at the top of each one. Duplicate the survey such that you can print 2-4 per single sheet of letter paper. Use copy and paste functions to do the duplicating.
  5. Upload your completed strip survey document to our shared OneDrive location.
  6. If time permits, print 26 total copies of your survey. Cut them apart and prepare to distribute them next week.
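Once the strips come back, the spectrum marks have to become numbers before any descriptive statistics can be computed. The sketch below shows one possible approach. It assumes you measure each X in centimeters from the left end of the line and divide by the line's length to get a score between 0 and 1. The `spectrum_score` helper and the responses are made up for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def spectrum_score(mark_cm, line_cm):
    """Convert the X's distance from the left end of the line into a 0-to-1 score."""
    if not 0 <= mark_cm <= line_cm:
        raise ValueError("the mark must fall on the line")
    return mark_cm / line_cm


# Made-up responses: (slicer answer, X position in cm, line length in cm).
responses = [
    ("rides the bus", 2.5, 10.0),
    ("rides the bus", 5.0, 10.0),
    ("drives", 7.5, 10.0),
    ("drives", 10.0, 10.0),
    ("drives", 7.5, 10.0),
]

# Segment the spectrum scores by slicer answer.
scores_by_group = defaultdict(list)
for slicer, mark_cm, line_cm in responses:
    scores_by_group[slicer].append(spectrum_score(mark_cm, line_cm))

# Descriptive statistics for each group.
summary = {
    slicer: {"n": len(scores), "mean": mean(scores), "median": median(scores)}
    for slicer, scores in scores_by_group.items()
}

for slicer, stats in summary.items():
    print(slicer, stats)
```

Dividing by the line length means surveys printed at different sizes can still be compared, which matters because the line's length is arbitrary.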

Station 3: Domain knowledge

Many data analysis competencies are rooted in command of field- or subject-specific knowledge. For example, predictions of changes in bus ridership due to economic shifts are best made by those who know transportation best.

For this station, review these two domain-specific data articles and reflect on them with your station mates. Record responses to the questions in your station guide.

  1. Visit this site exploring data in crime prediction algorithms. Summarize the findings from this study in response 3A. Take a moment to discuss with your group what data science has to contribute to this discussion.
  2. Compare this data-related study to this one, by Mark Egge, who analyzed bus bunching patterns in Pittsburgh. What did Mark's study find?
  3. Questions 3B - 3D: How are the applications of data science different in these two studies? What standards of data reliability or integrity exist? How are they different?
  4. Do some internet searching to find an article about topics related to a data domain of your own interest. When you have found the article, create a new document in this shared directory and include a link to your article along with a source, a description, and a question for discussion. Include your first name in the file.

Station 4: Gapminder Web!

In this station, you'll have the chance to explore a data visualization tool that has shaped the development of many online data visualization tools.

  1. Visit the Gapminder web app. Devote about 5 minutes to just exploring the interface, tinkering with trends through time, etc.
  2. The default chart shows income versus life expectancy by country. View the timeline video since 1800 with your team. Make a list of the factors that contribute to the major shifts and jumps seen by various country groups in this visualization.
  3. With your group, identify and rank the three factors that make this tool one of the most successful visualization endeavors on the internet.
  4. Visualize babies per woman versus income on the X axis. Discuss with your group the trends since 1800. 4A) What outliers exist? Why do they exist? How could the data be misinterpreted?
  5. Explore the visualization modes other than "bubbles," such as "maps" and "trends". Which mode is the least intuitive for the lay user? Most intuitive?
  6. 4B) What data sets would you like to see added to the Gapminder tool? What relationships might be interesting to explore?
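Question 4A asks you to spot outliers by eye. As a point of comparison, here is a minimal Python sketch of one common numeric rule of thumb: flag any value more than 1.5 times the interquartile range beyond the quartiles. This rule is an assumption brought in for illustration. It is not how Gapminder itself works, and the babies-per-woman numbers are made up rather than taken from the tool.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # first, second, and third quartiles
    spread = q3 - q1                    # interquartile range (IQR)
    low = q1 - 1.5 * spread
    high = q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


# Made-up babies-per-woman values for eight hypothetical countries.
babies_per_woman = [1.5, 1.7, 1.8, 2.0, 2.1, 2.3, 2.4, 6.9]

print(iqr_outliers(babies_per_woman))
```

A rule like this only says that a value is unusual compared with the rest. Explaining why it is unusual still takes the kind of domain knowledge discussed in Station 3.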

Station 5: Final project review

  1. 5A) Which project concerns a topic that is in line with your own professional interests? Why?
  2. 5B) Which project contains a graph that you found particularly easy to interpret? What made it so?
  3. 5C) Which project's data figure was NOT clear? Why? Write a suggestion to this student for how to improve their display.

Station 6: Grading system review

Our course uses non-traditional grading policies that are based not on points but on projects and self-accountability.

  1. Open the Technology Rediscovery grading policy overview and dedicate a few minutes to reading about the approach.
  2. Now read through each of the submitted final grade cards from Spring of 2019's Data 102 class. What classroom values do the cards suggest were important last term? Which ones do you believe in? What deductions can you make about what outcomes are important based on these cards?
  3. Access the master grade record of all grades submitted to CCAC by Eric Darsow. If you know how to do so, download or clone a copy of this read-only spreadsheet for your own tinkering.
  4. Respond to the questions on your response guide for station 6.