DAT-102 Data structures and formats

Introduction

Data! Data! It's everywhere! We need a classification scheme for the amorphous notion of data that pervades society today. These stations explore a classification scheme currently under development for the CCAC Data Analytics program:

Notes

  1. Many classification schemes for data depict quantitative and qualitative data as the head nodes of two separate branches that generally shun one another and do not play nicely with one another. The figure above shows that both qualitative and quantitative data can be structured in any of the topologies (sets, graphs, etc.). Because computerized tools let us transmit and transform data into so many formats, tools increasingly support investigating qualitative and quantitative data side by side.

Station 1: Binary

Objective:

Encode and decode latitude and longitude values in base 2 (aka binary) and base 10 form to internalize the concept that all data we process with computers must, somehow, be converted into a binary sequence.

Steps:

  1. Open Google Earth in Chrome (it must be the Chrome browser only!)
  2. Open the tools for noobs base converter
  3. Open the Computer Number Format article in The Wikipedia
  4. Dedicate 2-4 minutes to review the basis of geospatial coordinate systems on The Wikipedia
  5. Adjust the coordinate display units in Google Earth to decimal. Do this by clicking the hamburger icon in the upper right >> settings >> latitude/longitude formatting >> select "Decimal"
  6. Locate an interesting place on the planet that you have some interest or connection to
  7. Navigate to this place on Google Earth. With your cursor resting exactly on top of your desired location, read the latitude and longitude from the lower right corner of the Google Earth screen. See figure 1 below. Transfer this data to your station worksheet.
  8. Tinker with the Ambrsoft base converter you opened above and see how you can type a number (without a decimal point) into any box and have the value converted automatically to all the other available bases as you go. You'll need to convert the lat and lon values before and after their decimal points separately, for a total of four conversions. Write the coordinates in binary form on a 3x5 card with a hint about where the location is. The next group will try to decode your location and place a point on the paper global map.
  9. Once you have posted a binary version of a location for others, try decoding somebody else's clue: Type in the binary representation of each component (before and after the decimal for both lat and lon) and extract the base 10 version.
  10. Use the base converter to get the binary into decimal. With a lat and lon in decimal form, you can type that ordered pair into the Google Earth search feature.
  11. If your conversion doesn't work, make a note about how far you got and what you tried.
  12. Fill your remaining time with an exploration of The Wikipedia articles linked above.

Figure 1: Latitude and longitude in Google Earth
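The four conversions in steps 8 to 10 can also be sketched in a few lines of Python. This is only an illustration, not part of the station tools: the function names and the sample coordinate are made up for the example. Note one thing the by-hand method hides: converting the digits after the decimal point as a whole number drops leading zeros (0406 becomes 406), so the sketch also records how many digits the fractional part had.

```python
def encode_coordinate(text):
    """Convert a decimal coordinate string such as '-79.9959' to binary parts.

    Returns (sign, whole_bits, frac_bits, frac_digits). The parts before and
    after the decimal point are converted separately, as in the station steps.
    """
    sign = "-" if text.startswith("-") else ""
    whole, _, frac = text.lstrip("+-").partition(".")
    whole_bits = bin(int(whole))[2:]        # strip the '0b' prefix
    frac_bits = bin(int(frac or "0"))[2:]
    return sign, whole_bits, frac_bits, len(frac)


def decode_coordinate(sign, whole_bits, frac_bits, frac_digits):
    """Rebuild the decimal coordinate string from its binary parts."""
    whole = int(whole_bits, 2)
    # zfill restores leading zeros that were lost in the integer conversion
    frac = str(int(frac_bits, 2)).zfill(frac_digits)
    return f"{sign}{whole}.{frac}"


# Hypothetical sample value, chosen only to show the leading-zero case.
parts = encode_coordinate("40.0406")
print(parts)
print(decode_coordinate(*parts))
```

If a decoded location lands far from the hint on the card, a lost leading zero or a dropped minus sign is a likely cause worth noting on your worksheet.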

Station 2: Graphs

Goal:

Demonstrate command of the concept of an undirected graph by creating a visual and a tabular representation of a graph-based data structure.

Steps:

  1. Dedicate 3-6 minutes to study the essentials of mathematical graphs using this Wikipedia article
  2. Visit littlesis.org's sample network chart, study that sample, and then look at a few others by navigating to Explore >> Maps
  3. Once you have read about the organization and what kind of data it assembles, search for a person of interest and fashion a graph data structure to represent that person's connections to other people and organizations. You can find people using Explore >> Lists or by searching directly. Since each connected entity is also a link, you can create a web of relationships.
  4. Studying the samples provided at your station, create a tabular representation of this data, showing which nodes are connected to which other nodes. Think about what data type you would declare each column of your tabular version of the network graph.
  5. Post a copy of your graph on the cork board nearest this station for others to admire.
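One common tabular form for step 4 is an edge list: a two-column table with one row per connection. The sketch below assumes that layout and uses placeholder node names rather than real LittleSis data. It turns the table into an adjacency lookup, recording each connection in both directions because the graph is undirected.

```python
from collections import defaultdict

# Hypothetical edge list: each row is one connection (two text columns).
edges = [
    ("Person A", "Organization X"),
    ("Person A", "Person B"),
    ("Person B", "Organization X"),
    ("Person B", "Organization Y"),
]


def build_adjacency(edge_rows):
    """Map every node to the set of nodes it is connected to."""
    adjacency = defaultdict(set)
    for source, target in edge_rows:
        adjacency[source].add(target)
        adjacency[target].add(source)  # undirected: store both directions
    return adjacency


adjacency = build_adjacency(edges)
for node in sorted(adjacency):
    print(node, "->", sorted(adjacency[node]))
```

In this layout both columns hold text (node names). A third column, such as a relationship type, would be another text column.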

Station 3: Trees

Goal:

Create hierarchical representations of organizational relationships and demonstrate how trees can also be encoded in plain text for reading into and out of a computer.

Steps:

  1. Create an organizational chart of a company or group you've worked with in the past. Include a "root" node and child nodes, with connections showing their institutional chains of reporting and control.
  2. Study the Wikipedia article on tree traversals
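As a rough illustration of both steps, the sketch below encodes a small organizational chart as plain text and then walks it with a pre-order traversal (visit a node, then each of its children in turn). JSON is used here as one possible plain-text encoding, not necessarily the one intended for this station, and the job titles are invented.

```python
import json

# Hypothetical org chart: each node has a name and a list of child nodes.
org_chart = {
    "name": "Director",
    "children": [
        {
            "name": "Operations Manager",
            "children": [{"name": "Analyst", "children": []}],
        },
        {"name": "Finance Manager", "children": []},
    ],
}


def preorder(node):
    """Return node names in pre-order: the node first, then its subtrees."""
    names = [node["name"]]
    for child in node["children"]:
        names.extend(preorder(child))
    return names


text = json.dumps(org_chart, indent=2)  # tree -> plain text
restored = json.loads(text)             # plain text -> tree
print(text)
print(preorder(restored))
```

The root node is the one dictionary that is not inside any other node's "children" list.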

Station 4: Table Formats

Tables in their simplest form consist of rows and columns of data, which spreadsheets can ingest happily. Since so many programs input and output tabular data, we can use a common exchange format called a CSV file. CSV stands for comma-separated values, and almost every spreadsheet tool in existence today can read and write it.

Steps:

  1. Dedicate about 5 minutes to reading the overview paragraphs and then the Basic Rules section about CSV files on the Wikipedia.
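To see the basic rules in action, the sketch below writes a small table to CSV text and reads it back using Python's standard csv module. The table contents are invented for the example. The last row contains a comma and double quotes, so the writer wraps that field in quotes and doubles the inner quotes.

```python
import csv
import io

# Hypothetical table: a header row followed by two data rows.
rows = [
    ["station", "topic", "minutes"],
    ["Station 1", "Binary", "12"],
    ["Station 4", 'Tables, "CSV" files', "5"],
]

# Write the rows to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back; every value comes back as a string.
parsed = list(csv.reader(io.StringIO(csv_text)))
print(parsed == rows)
```

Because CSV stores everything as text, the program reading the file has to decide which columns to treat as numbers.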

Station 5: Mapping key-values

Under construction! Stay tuned.

Station 6: Qualitative goodness

Qualitative data encompasses text, images, and audio that cannot be immediately represented with quantitative tools. In this exercise, we'll attempt to convert some qualitative data into quantitative data that can then be visualized.

Steps:

  1. Choose a speech by a public figure you care about or are at least interested in.
  2. Find a link to the speech online and print the text out so we can mark it.
  3. Open this Google Doc for coding qualitative data. Study the existing columns and populate it with a speech of your choosing. Carefully count and code the requested data.
  4. Then add a column or two of your interest to the table.
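Counting by hand is the point of the exercise, but the same idea can be sketched in code: turn a passage of text into one row of numbers. The columns below (total words plus counts of a few chosen terms) are assumptions made for illustration and are not the columns in the station's coding document. The sample sentence is invented, not taken from a real speech.

```python
import re
from collections import Counter


def code_speech(text, terms):
    """Return one table row: total word count plus a count for each term."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    row = {"total_words": len(words)}
    for term in terms:
        row[term] = counts[term]  # Counter gives 0 for terms never seen
    return row


# Invented placeholder text standing in for a printed speech.
sample = "We choose hope. We choose work, and we choose each other."
row = code_speech(sample, ["we", "choose", "hope"])
print(row)
```

Each key in the resulting row corresponds to one column you could add to the table and later chart.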