This week's class will focus on data organization and management for GIS applications in anthropological research. Specifically, we will discuss the importance of relational databases and ways of organizing and querying data stored in related tables.
I will also describe approaches to indexing data for anthropological fieldwork: the ID number system at various spatial scales, and methods for linking tables between ID number series.
In lab this week we will focus on a-spatial data management techniques in ArcGIS. Joining and relating tables are powerful tools, but they also create a structure that can make queries and representation more difficult.
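To make the point about joins concrete, here is a minimal sketch outside ArcGIS, using Python's built-in sqlite3 module. The table names, field names, and values are hypothetical and are not taken from the lab data. It shows how a one-to-many join repeats the site row once per artifact, which is why a naive summary of the joined table can go wrong.

```python
import sqlite3

# Hypothetical tables: one row per site, many artifact rows per site.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sites (site_id INTEGER PRIMARY KEY, area_m2 REAL);
    CREATE TABLE artifacts (
        artifact_id INTEGER PRIMARY KEY,
        site_id INTEGER REFERENCES sites(site_id),
        material TEXT
    );
    INSERT INTO sites VALUES (1, 1200.0), (2, 450.0);
    INSERT INTO artifacts VALUES
        (1, 1, 'Obs'), (2, 1, 'NotObs'), (3, 1, 'Obs'), (4, 2, 'NotObs');
""")

# One-to-many join: each site row is repeated once per matching artifact.
joined = conn.execute("""
    SELECT s.site_id, s.area_m2, a.material
    FROM sites AS s
    JOIN artifacts AS a ON a.site_id = s.site_id
    ORDER BY a.artifact_id
""").fetchall()

for row in joined:
    print(row)

# Summing site area over the joined rows counts site 1 three times.
naive_total_area = sum(area for _, area, _ in joined)
true_total_area = conn.execute("SELECT SUM(area_m2) FROM sites").fetchone()[0]
print(naive_total_area, true_total_area)
```

The joined table has four rows even though there are only two sites, so any statistic computed on a site-level field after the join needs to account for the repetition.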
Download Callalli Geodatabase.
Look at the data in ArcMap.
Display. Turn off ArchID_centroids.
Discuss file formats: coverages, shapefiles, GDB. Raster / GRID.
Problem 1:
Ceramics Lab results. Cleaning up the database logic.
Working at the artifact level of analysis, the ID number system is problematic because you cannot refer to a single artifact from an index field.
What are some solutions?
1. Open the Ceramics_Lab2 attribute table.
Do you now have a six-digit ID#? That’s your new unique ID number, which allows you to refer to individual artifacts.
2. Cut off the “diameter” measure for more efficient design
This is a somewhat extreme example of "database normalization", but you can see the principle behind it.
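The two ideas above (a unique artifact-level ID, and splitting the diameter measure into its own table) can be sketched in plain Python. This is only an illustration under stated assumptions: the handout does not say how the six-digit number is built, so the sketch assumes a three-digit lot-level ID combined with a three-digit sequence number, and all field names and values are hypothetical.

```python
# Hypothetical lab records: a lot-level ID, a sequence number within the lot,
# and a diameter that is only recorded for some artifacts.
lab_rows = [
    {"lot_id": 101, "seq": 1, "form": "Bowl", "diameter_cm": 18.0},
    {"lot_id": 101, "seq": 2, "form": "Jar", "diameter_cm": None},
    {"lot_id": 245, "seq": 1, "form": "Bowl", "diameter_cm": None},
]


def make_artifact_id(lot_id: int, seq: int) -> int:
    """Combine a 3-digit lot ID and a sequence number (0-999) into one six-digit key.

    Assumed scheme for illustration only, not the lab's documented method.
    """
    if not (100 <= lot_id <= 999 and 0 <= seq <= 999):
        raise ValueError("lot_id must have 3 digits and seq at most 3 digits")
    return lot_id * 1000 + seq


artifacts = []  # one row per artifact, with no diameter column
diameters = []  # only the artifacts that actually have a diameter

for row in lab_rows:
    artifact_id = make_artifact_id(row["lot_id"], row["seq"])
    artifacts.append({"artifact_id": artifact_id, "form": row["form"]})
    if row["diameter_cm"] is not None:
        diameters.append(
            {"artifact_id": artifact_id, "diameter_cm": row["diameter_cm"]}
        )

print(artifacts)
print(diameters)
```

The design choice here is that the main table no longer carries an empty diameter value for every unmeasured artifact, and the two tables can still be linked through the shared artifact ID.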
Other selection methods. Try selecting data in the Ceram_p file (not the lab file) using Select by Attributes. The text you enter in the box is a modified SQL query. Try to select all of the Bowls from the LIP period and look at the selection on the map.
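As a rough illustration of what such a query looks like, the sketch below runs a comparable WHERE clause against a small in-memory table using Python's sqlite3 module. The field names Form and Period, the ArchID values, and the sample rows are made up for the example, so check the actual field names in the Ceram_p attribute table. In ArcMap the dialog supplies the SELECT * FROM ... WHERE part for you, and the way field names are delimited depends on the data source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the Ceram_p attribute table.
conn.execute("CREATE TABLE Ceram_p (ArchID INTEGER, Form TEXT, Period TEXT)")
conn.executemany(
    "INSERT INTO Ceram_p VALUES (?, ?, ?)",
    [
        (1, "Bowl", "LIP"),
        (2, "Jar", "LIP"),
        (3, "Bowl", "LH"),
        (4, "Bowl", "LIP"),
    ],
)

# Only this part corresponds to what you type in the Select by Attributes box.
where_clause = "Form = 'Bowl' AND Period = 'LIP'"

query = f"SELECT ArchID FROM Ceram_p WHERE {where_clause} ORDER BY ArchID"
selected = [row[0] for row in conn.execute(query)]
print(selected)
```

Both conditions must hold for a row to be selected, so the jar from the LIP period and the bowl from the other period are left out.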
Problem 2:
Show the relationship between the size of the archaeological site (m2) and the proportion of obsidian artifacts in the lab2 analysis.
The basic problem here is that there are MANY artifact records for each site (a one-to-many relationship). In order to represent the Many data on the map, we need to collapse the contents of the Many table into per-site units so the One table can symbolize it.
Steps:
1. Problem. You’ve explored the data in Site_A and in Lithics_Lab1 and you recognize that you have to collapse the values in Lithics_Lab1 before you can show them on the map. You can collapse table values with a tool called Pivot Tables and then by Summarizing on the SiteID column.
First, make sure there are no rows selected in the "Lithics_Lab1" table.
Open the Pivot Tables function in the toolbox under Data Management > Tables > Pivot Tables
2. Pivoting the data table. In the Pivot Tables box select “Lithics_Lab1” as your pivot table. This is the table that has the quantity of data we want to reduce. We are effectively swapping the rows and columns in the original table.
You can use the default filename. It should end up in your geodatabase. Have a look at the contents of the output table. Note that there are many SITEID values that are NULL. That is because there were artifacts collected outside of sites in some cases. Note also that the MATERIAL TYPE values have become column headings (Obs, NotObs), but there are still duplicate sites in rows, as you can see in the SITEID column.
Look at the resulting table. Is there only one SITEID value per row? Why are there so many with a <NULL> SITEID value? Where did that data come from?
3. Representing the data. Now that the results for each site are found on a single line, you can Join the table to the Site_A feature.
Look at the resulting table. Does it contain all the information you need collapsed onto rows? Look at the number of records. There were 88 in the original Site_A table, now there are fewer (or there will be if you update by scrolling down the table). Why is that? The original question asked about the relationship between site size and presence of obsidian. How are you going to measure site size?
4. Symbolizing the data. You can show these data using a number of tools. For example, for a simple map you could divide the weight of obsidian by the weight of non-obsidian per site and symbolize by the ratio of obsidian over non-obsidian.
Here, we will do something slightly more complex and use a pie chart to show the relationship with respect to size.
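The logic of steps 1 to 4 can be sketched in plain Python, independent of the ArcGIS tools. This is a simplified illustration, not the Pivot Table tool itself: the rows, weights, and site areas are invented, and the sketch computes obsidian as a share of total weight (rather than obsidian over non-obsidian) so that a site with no non-obsidian material does not cause a division by zero.

```python
from collections import defaultdict

# Hypothetical Lithics_Lab1 rows: (SITEID, material, weight in grams).
# A SITEID of None stands for artifacts collected outside any site.
lithics = [
    (10, "Obs", 12.0),
    (10, "NotObs", 30.0),
    (10, "Obs", 8.0),
    (11, "NotObs", 5.0),
    (None, "Obs", 3.0),
]

# Hypothetical Site_A values: SITEID -> site area in square meters.
site_area_m2 = {10: 1500.0, 11: 400.0, 12: 900.0}

# Pivot and summarize: material values become columns, and the weights
# are summed so that each site ends up on a single row.
totals = defaultdict(lambda: {"Obs": 0.0, "NotObs": 0.0})
for site_id, material, weight in lithics:
    if site_id is None:
        continue  # off-site artifacts cannot be joined to a site
    totals[site_id][material] += weight

# Join to the site table, keeping only the sites that have lab rows.
joined = {}
for site_id, area in site_area_m2.items():
    if site_id not in totals:
        continue
    obs = totals[site_id]["Obs"]
    not_obs = totals[site_id]["NotObs"]
    joined[site_id] = {
        "area_m2": area,
        "Obs": obs,
        "NotObs": not_obs,
        "obs_share": obs / (obs + not_obs),
    }

for site_id, values in sorted(joined.items()):
    print(site_id, values)
```

In this sketch site 12 has no lab rows and so is missing from the joined result, which is one way a joined table can end up with fewer records than the original site table.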
As mentioned in class, next week we will focus on methods of bringing data into GIS. Which project area do you plan to work in? We will begin acquiring data (imagery and topographic data) next week, so please decide on a project area by then.
You will need Latitude/Longitude coordinates in WGS1984 for the four corners of a bounding box around your study area. An easy way to get these values is in Google Earth. You'll want to pick a region about the size of a county in the US for this assignment.
Write down these four corner coordinate values. We'll use them in class next week.
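If you want to sanity-check the values you wrote down, a small Python sketch like the following may help. The BoundingBox class and the example coordinates are hypothetical, and the size estimate is only a rough approximation (about 111 km per degree of latitude, with longitude scaled by the cosine of the mid-latitude). It assumes decimal degrees with south and west as negative values, and it does not handle boxes that cross the 180 degree meridian.

```python
import math
from dataclasses import dataclass

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude


@dataclass(frozen=True)
class BoundingBox:
    """Hypothetical helper for a WGS1984 bounding box in decimal degrees."""

    south: float
    west: float
    north: float
    east: float

    def __post_init__(self):
        if not (-90 <= self.south < self.north <= 90):
            raise ValueError("latitudes must satisfy -90 <= south < north <= 90")
        if not (-180 <= self.west < self.east <= 180):
            raise ValueError("longitudes must satisfy -180 <= west < east <= 180")

    def corners(self):
        """Return the four corners as (latitude, longitude) pairs."""
        return {
            "SW": (self.south, self.west),
            "SE": (self.south, self.east),
            "NW": (self.north, self.west),
            "NE": (self.north, self.east),
        }

    def approx_size_km(self):
        """Return a rough (width, height) of the box in kilometers."""
        height = (self.north - self.south) * KM_PER_DEGREE
        mid_lat = math.radians((self.north + self.south) / 2)
        width = (self.east - self.west) * KM_PER_DEGREE * math.cos(mid_lat)
        return width, height


# Made-up example values; replace them with your own study area.
box = BoundingBox(south=-15.7, west=-71.7, north=-15.3, east=-71.2)
width_km, height_km = box.approx_size_km()
print(box.corners())
print(round(width_km, 1), round(height_km, 1))
```

Swapping north and south, or mixing up latitude and longitude, raises an error instead of silently producing a wrong box.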