California Community Colleges: Student Transportation and Carbon Emissions

California’s system of community colleges is the largest system of higher education in the world; it comprises 110 college campuses and over 2.5 million students. Each day millions of students commute to the campuses. This commute affects the students in the form of time spent commuting as well as the monetary cost of owning and driving an automobile. The commute also has an environmental impact upon the local community. This project focuses on one of those environmental impacts, the carbon emissions from the automobiles. The impacts upon the students and the community are directly linked to the distance traveled. By performing network analysis within a geographic information system (GIS) to estimate the distances that students travel to campus, it was possible to provide estimates of the daily impact of the commute upon the students and the communities through which they travel.


Chapter 1 - Introduction
This project was primarily concerned with the factors and impacts related to community college students in California commuting to campus. Section 1.1 introduces the client, Mr. John Roach of the Foundation for California Community Colleges. Section 1.2 provides a detailed statement of the problem. Section 1.3 presents the proposed solution along with the goals and objectives of the project, the scope of the project, and the methodology that was used in completing the project. Section 1.4 identifies the primary audience of this report. Section 1.5, the final section of this chapter, provides an overview of the rest of this report.

Client
The client for this project was the Foundation for California Community Colleges, a nonprofit organization that supports California's community colleges with programs and services that offer benefits to many diverse elements within the college system. Mr. John Roach, the Director of Systems Analysis and Research at the Foundation, served as the Project Sponsor and was the point of contact for all issues pertaining to this project. Mr. Roach is involved with the California Community Colleges Geographic Information Systems Collaborative (CCCGIS), a team comprised of members of multiple organizations that promotes the employment of geospatial information to provide answers and insight to California's community colleges on many issues.

Problem Statement
Every day millions of students who attend one of the 110 community colleges in California travel by personal automobile to and from campus. This has an impact upon both the students and the communities through which they travel. The impact upon the students is in the form of cost of travel in both time and money, while the impact on the communities is primarily environmental in the form of carbon emissions from the automobiles.
The impacts due to commuting to a campus are directly related to the distance traveled by the student. By estimating the total vehicle miles traveled per campus, it is possible to provide state-wide estimates of the cost of commuting to students, transportation time, and carbon emissions for each campus and district. The Foundation for California Community Colleges envisioned this project as a way to help raise awareness of the many impacts of student commuting and to provide estimates that will support planning and research in the future.

Proposed Solution
The purpose of this project was to estimate the factors and impacts related to students commuting to community college campuses by car. These estimates were calculated by using student enrollment data by ZIP code, other applicable spatial data, and geographic information system (GIS) analysis tools.

Goals and Objectives
The primary goal of this project was to estimate multiple factors associated with student vehicle transportation for each of the 110 community college campuses that comprise the California Community College System. The factors that were estimated include vehicle transit distance from home to campus, student transit time, student transit cost, and vehicle carbon emissions from home to campus.
The Foundation for California Community Colleges has indicated that its objective for this project is to raise awareness of the impact upon both the student and the community of commuting to and from campus by means of private automobile. The estimates that were presented as the outcome of this project are intended to support future planning and research regarding student transportation.

Scope
The scope of this project was limited to providing estimates of the factors and impact of student transportation related to commuting to the campuses of the California Community College system. The limited scope was dictated by the combination of the data available, the assumptions that were made, and the time available for work on the project. The limited scope of providing estimates falls in line with the requirements and objectives set forth by the Foundation for California Community Colleges.
The majority of the data was provided by Mr. Roach of the Foundation for California Community Colleges. Mr. Roach provided Fall 2007 student enrollment by ZIP code for each of the 110 California Community College campuses, California Community College campus polygons and centroids, and California Community College District polygons. Other readily available data that were acquired were ZIP code polygons and centroids for the State of California, and road network data for the State of California. Both the ZIP code data and the road network data were created by Tele Atlas North America, Inc. and are licensed to ESRI.
This project made several assumptions to provide the required estimates of transportation related factors. It was assumed that the ZIP code polygon centroid was the starting point for the commute to campus, while the campus polygon centroid served as the destination point. With a student population of over 2.6 million, this assumption was deemed reasonable enough to provide an estimate.
This project utilized ArcGIS 9.3.1 to perform the analysis that estimated the vehicle miles traveled and travel time related to commuting to campus. The other factors were estimated using available resources such as a carbon dioxide emissions calculator at Carbonify.com and a driving cost calculator at CommuteSolutions.org. At

Methods
A traditional project development methodology was utilized in order to complete the tasks that were required to undertake this project and present the requested estimates of student transportation related factors to the Foundation for California Community Colleges. This methodology was deemed the most appropriate approach for this project due to the nature of the goals and objectives. It was applicable to look at this project as a straightforward data collection and analysis project since it was intended to be used in order to present findings in the form of estimates and was not being used to create a business application or process.

Audience
This project report is written for employees of the Foundation for California Community Colleges and of the community college system, both those who are familiar with geographic information systems (GIS) and those who are not. Mr. Roach and other members of the California Community Colleges Geographic Information Systems Collaborative (CCCGIS) are professional GIS users. However, other employees of the Foundation for California Community Colleges and the community college system may not have had any prior exposure to GIS.

Overview of the Rest of this Report
A literature review concerning the background of the problem being addressed by this project is presented in Chapter 2. Chapter 3 covers system analysis and design, while database design is discussed in Chapter 4. An in-depth discussion of the implementation of this project is presented in Chapter 5. The results and analysis of the project can be found in Chapter 6, and the conclusions and recommendations for future work are found in Chapter 7.

Chapter 2 - Background and Literature Review
This chapter introduces the two major topics of concern for this project. Section 2.1 provides an overview of estimating the costs involved in transportation, and presents a similar project which was completed for Park University in Missouri. Section 2.2 discusses the issue of carbon emissions and why it is important to address it. Section 2.3 provides a summary of what was gained in reviewing the literature.

Estimating Costs of Transportation
The intention of this project was to provide answers to questions regarding the costs of students commuting to the campuses of the California Community College system. The Santa Cruz County Regional Transportation Commission provides a summary of the overall costs of transportation by motor vehicle on its web page, The True Cost of Driving. While it is common to take only fuel prices into account when thinking about the cost of transportation, fuel is not the only cost (Santa Cruz County Regional Transportation Commission [SCCRTC], 2009). Others include the costs to "buy and maintain a car, including tune-ups, oil and tires, as well as for insurance, registration, and parking. Indirect costs of driving, such as road construction and maintenance, add to drivers' financial burden through taxes and fees" (SCCRTC, 2009). It is also important to look at the impact on the local community or society as a whole, including air pollution, traffic congestion, and health care. When commuting to campus, the students may be faced with many travel routes from which to choose. Makin, Healey, and Dowers (1997) propose that the decisions that commuters make regarding the methods and routes taken are influenced most by travel time. For the purpose of modeling commuter travel, Gordon, Richardson, and Jun (1991) suggest that travel time, as well as the physical layout of the route, are the most important factors. However, Hanson and Huff (1988) state that travel decisions are not based exclusively on travel time or distance, but may also take into account other factors such as required stops, or the choice of a safer or more aesthetically pleasing drive.
According to Blumen and Kellerman (1990) many social scientists have relied upon Euclidean distances to estimate travel distances, proposing that the relationship between the two is similar enough to provide a reasonable estimation. However, for the purpose of estimating travel time, Thériault, Vandersmissen, Lee-Gosselin, & Leroux (1999) state that the estimates calculated from Euclidean distances are not precise enough to be used. Therefore, distances along road networks were used for this project instead of Euclidean distances.
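The difference between the two distance measures can be illustrated with a minimal Python sketch. It is not the method used in this project, which relied on the ArcGIS Network Analyst extension; the node coordinates, road segments, and function names below are hypothetical and serve only to show that a shortest path along a road network can be noticeably longer than the straight-line distance between the same two points.

```python
import heapq
import math

# Hypothetical toy road network (not project data): node -> (x, y) in miles.
NODES = {"home": (0.0, 0.0), "a": (3.0, 0.0), "campus": (3.0, 4.0)}
# Assumed undirected road segments between nodes.
EDGES = [("home", "a"), ("a", "campus")]


def euclidean(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def build_graph(nodes, edges):
    """Adjacency list where each segment length is its straight-line length."""
    graph = {name: [] for name in nodes}
    for u, v in edges:
        length = euclidean(nodes[u], nodes[v])
        graph[u].append((v, length))
        graph[v].append((u, length))
    return graph


def network_distance(graph, start, goal):
    """Shortest path length by Dijkstra's algorithm; math.inf if unreachable."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, length in graph[node]:
            candidate = dist + length
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


graph = build_graph(NODES, EDGES)
straight_line = euclidean(NODES["home"], NODES["campus"])
along_roads = network_distance(graph, "home", "campus")
print(f"Euclidean: {straight_line} miles, network: {along_roads} miles")
```

In this toy layout the road route must go around a corner, so the network distance exceeds the Euclidean distance; real road networks add further differences such as one-way streets and speed limits that this sketch ignores.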
A similar study was completed at a much smaller scale for the single campus of Park University, located in Parkville, Missouri. The outcome of that study was an estimation of the amount of emissions due to commuting students, faculty and staff by using the travel distance based upon their ZIP codes (Jones & Hageman, 2008). Using ZIP codes to estimate the individual's travel distance for the Park University study is the same approach used for this project. However, along with estimating vehicle miles traveled and vehicle carbon emissions, this project also estimates transit time and transit cost. Jones and Hageman stated that it is not simple to calculate the emissions from motor vehicles and that various other factors affect vehicle emissions. This project aims to present a simple estimate of the average carbon dioxide emissions and will make use of a carbon dioxide calculator, while the Park University study also calculated the amounts of other pollutants that comprise vehicle emissions. According to the United States Energy Information Administration (2008), the amount of vehicle emissions is directly linked to the amount of vehicle miles traveled.

Carbon Emissions
There is growing concern globally about the effects of the ever-increasing dependence upon motor vehicles on the environment. Carbon dioxide, a greenhouse gas, is a major component of vehicle emissions. There is much interest in how vehicle carbon emissions are contributing to overall carbon footprints and how they are contributing to a potential climate change. "CO2 [carbon dioxide] emissions of recent years have grown at the highest rates ever recorded, an observed trend incompatible with stabilizing atmospheric concentrations of greenhouse gases and avoiding long-term climate change" (Quadrelli & Peterson, 2007, p. 5938). Price et al. (2006) state "transportation is the fastest-growing source of CO2 emissions globally" (p. 21), and according to Glaeser and Kahn (2008), households and vehicles are the source of close to 40 percent of carbon dioxide emissions. According to Quadrelli and Peterson, the use of fossil fuels is the biggest factor leading to potential climate change, with 80 percent of greenhouse gases attributed to it. It is suggested by Newman and Kenworthy (1999) that urban areas continue to see an increase in the reliance upon motor vehicles and that there is an increasing concern regarding this dependence due to greenhouse gases attributed to vehicle emissions. According to the United States Energy Information Administration (2008), "transportation sector carbon dioxide emissions in 2007 were 431.8 million metric tons higher than in 1990, an increase that represents 44 percent of the growth in unadjusted energy-related carbon dioxide emissions from all end-use sectors over the period" (p. 19). However, the United States Energy Information Administration also reported that carbon dioxide emissions attributed to transportation in 2007 remained at approximately the same level as those in 2006 due to an increase in gas prices and a slowdown in the economy.
In another similar project, Conway et al. (2008) described the development of an ecological footprint for the University of Toronto at Mississauga. The results of the calculations revealed that after on-site energy consumption, student commuting was the next largest factor in the campus' footprint. This finding lends credence to the importance of understanding the commuting habits of students and how they relate to vehicle carbon emissions. Furthermore, Toor (2003) states that "the daily movement of people back and forth to campus in automobiles burning fossil fuels is one of the largest impacts a typical educational institution imposes on the life-support systems of the planet" (p. 131).
The trends in greenhouse gas emissions in the state of California closely follow the overall trend of the United States as a whole. "The transportation sector is the single largest category of California's GHG [greenhouse gas] emissions, producing 41 percent of the state's total emissions in 2004. Most of California's emissions, 81 percent, are carbon dioxide produced from fossil fuel combustion" (Bemmis, 2006, p. ii). According to Glaeser and Kahn (2008), California is the state with the areas of lowest emissions. However, Bemmis reports that "as the second largest emitter of GHG emissions in the United States and twelfth to sixteenth largest in the world, the state contributes a significant quantity of GHGs to the atmosphere" (p. i). Bemmis also states that the amount of greenhouse gas emissions in California is increasing. The California Community Colleges system is interested in learning how students commuting to the various campuses are contributing to the increase in greenhouse gases in the form of vehicle carbon emissions. The findings from this project provided estimates that can be used to support planning and research into ways to reduce the amount of carbon emissions attributable to the California Community Colleges system.

Summary
Through a review of the literature related to this project, an understanding was gained of both the importance of addressing the impact of student transportation and the approach to be used. It is evident in the literature that transportation-caused carbon emissions are a growing problem that must be dealt with. This project is a first step for the California Community Colleges system in dealing with the problem. The literature on estimating the costs of transportation also provided insight into the methodology that dictated the system design used in completing this project.

Chapter 3 - Systems Analysis and Design
This chapter introduces the project requirements, the system design, and the project plan. Section 3.1 re-introduces the problem statement for this project. Section 3.2 is concerned with the requirements of this project and contains two subsections: 3.2.1, which presents the functional requirements, and 3.2.2, which presents the non-functional requirements. Section 3.3 briefly describes the system design, and section 3.4 is an overview of the project plan. Finally, section 3.5 provides a summary of the chapter.

Problem Statement
Every day millions of students who attend one of the 110 community colleges in California travel by personal automobile to and from campus. This has an impact upon both the students and the community in which they are traveling. The impact upon the students is in the form of cost of travel in both time and money, while to the community it is the overall carbon emissions that can be attributed directly to student travel.
The impacts of commuting to a campus are directly related to the distance traveled by the student. By estimating the total vehicle miles traveled per campus, it is possible to provide state-wide estimates of the cost of commuting to students, transportation time, and carbon emissions for each campus and for each district. The Foundation for California Community Colleges envisioned this project as a way to help raise awareness of the many effects of student commuting and to provide estimates that will support planning and research in the future.

Functional Requirements
As stated in the goals and objectives section of this report, the purpose of this project was to provide estimates of vehicle miles traveled, travel time, transit cost, and vehicle carbon emissions related to community college students commuting to campus. The functional requirements of this project were determined by the intended purpose of this project (Table 1). The function of providing an estimate of vehicle miles traveled by students between their homes, which are represented by the delivery-based centroids of their listed ZIP codes, and the campuses that they attend was not only a requirement on its own but was also necessary in order to perform the calculations that fulfilled two of the other requirements. The travel time estimates were calculated by network analysis at the same time as the vehicle miles traveled, while the transit cost and vehicle carbon emission estimates were calculated by multiplying the vehicle miles traveled by the appropriate constants.
The client's requirements for this project were a set of maps which detailed the four estimates that were calculated at both the campus and the district level, as well as a tabular summary of the findings.
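The multiplication step described above can be sketched in a few lines of Python. The per-mile constants below are placeholders chosen purely for illustration; they are not the values obtained from the calculators used in this project, and the function and class names are hypothetical.

```python
from dataclasses import dataclass

# Placeholder constants: assumptions for illustration, NOT the project's values.
COST_PER_MILE_USD = 0.50   # hypothetical driving cost per vehicle mile
CO2_LB_PER_MILE = 0.90     # hypothetical pounds of CO2 per vehicle mile


@dataclass
class CommuteEstimate:
    miles: float
    cost_usd: float
    co2_lb: float


def estimate_from_miles(miles: float) -> CommuteEstimate:
    """Derive cost and emissions by multiplying miles traveled by constants."""
    if miles < 0:
        raise ValueError("miles must be non-negative")
    return CommuteEstimate(
        miles=miles,
        cost_usd=miles * COST_PER_MILE_USD,
        co2_lb=miles * CO2_LB_PER_MILE,
    )


one_way = estimate_from_miles(10.0)
print(one_way)
```

Because both derived estimates are linear in the vehicle miles traveled, any error in the distance estimate carries through proportionally to the cost and emissions estimates.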

Non-Functional Requirements
Due to the nature of this project, which was completed by performing analyses and reporting the results, the non-functional requirements were concerned predominantly with the software and hardware requirements (Table 2). In order to run the Network Analyst extension, a single-use or concurrent-use Network Analyst license is required. Along with a Network Analyst license, an ArcView, ArcEditor, or ArcInfo license is necessary.
The sole hardware requirement for this project was a computer capable of running the required software. This requirement was met by using a Dell Precision M4400 laptop computer.

System Design
The system design required for completing this project consisted of a file geodatabase, which is described in more detail in the next chapter, and the GIS software.

Project Plan
The traditional project development methodology, as stated in the methods statement of this report, was used in completing this project and consisted of seven major tasks: specify the requirements, acquire and analyze the data, construct a geodatabase, perform analysis, prepare the deliverables, submit draft report, and provide the final deliverables to the client.
Specifying the requirements of this project was the first task that was completed. This step included conversing with Mr. Roach of the Foundation for California Community Colleges in order to determine what the objectives of this project were, what data were available, what support could be expected, and what the requested deliverables were.
The next task scheduled was to acquire and analyze the data. Mr. Roach supplied enrollment data for Fall 2007 by ZIP code for each of the 110 community colleges, polygons of each of the campuses along with their centroids, and polygons of each of the districts. Other data, including ZIP code polygons and their delivery-based centroids in the state of California and bordering areas, and road networks covering the same areas, were acquired from readily available sources. Once all the required data were received they were analyzed to determine if they were correct and complete.
After the data were determined to be correct and complete, the next task was the geodatabase construction. The geodatabase was comprised of campus polygon centroids, district polygons, ZIP code polygons and centroids, the road network, and all applicable tables.
With the geodatabase completed, it was time to perform the analysis. During this step vehicle miles traveled and travel time were estimated by using the centroids of the ZIP code polygons and the centroids of the campus polygons as the respective start and end points, and the road network. The calculation of the other factors that were to be estimated made use of the vehicle miles traveled estimates and the other appropriate resources that were available.
Once estimates for all of the factors for each of the campuses were obtained, the next task was to prepare the deliverables. This step included producing maps, a tabular summary for each campus and district, a report detailing the results of the project, and PowerPoint slides used in a public presentation.
A workflow diagram (Figure 3-1) shows the linear step-by-step method that was used in completing this project. The required tasks followed one after another until the submission of the draft version of the report. Once the draft was accepted, a final version of it and the other deliverables were provided to Mr. Roach. Each task had an estimated time in hours for completion assigned to it; this is presented in Table 3. A Gantt chart (Figure 3-2) shows the planned schedule that this project was intended to follow.

Figure 3-2: Project Gantt Chart
In reality the project schedule differed considerably from the planned schedule due to two unexpected factors. The first factor was the lack of time that was available to devote to project work because of unforeseen academic commitments. The second factor was the delay in receiving the student enrollment data from the client; however, it should be noted that this delay was not due to any negligence by the client. While it was originally anticipated that the data would be received at the end of April, and some of it was, the entire data collection necessary for this project was not complete until the middle of August. The delay in receiving the enrollment data resulted in a four-month shift in schedule.

Summary
This chapter presented both the functional and non-functional requirements of this project. The nature of this project and its expected outcome dictated the functional requirements and accounted for the lack of any concrete non-functional requirements other than a computer with GIS software loaded on it. This chapter also provided a description of the tasks that were required to complete this project and showed how a projected schedule can easily become invalid.

Chapter 4 - Database Design
In this chapter both the conceptual data model and the logical data model will be discussed. The datasets that made up the geodatabase at the end of this project are introduced, as well as the original datasets that were received from the client or acquired elsewhere. The changes that were made to the original datasets in order to prepare them for use in the analysis portion of this project are also discussed.

Conceptual Data Model
A conceptual data model provides a way to define the project data and their relationships. The conceptual model for this project shows the relationship between the colleges, the students, the ZIP codes of the students, and the commute route from home to campus (Figure 4-1). Each campus, identified by a College Identification number (CID) and a name, has many enrolled students, who are identified by Student Identification numbers and have ZIP codes associated with them. Each ZIP code has the number of students residing in that ZIP code associated with it. The ZIP code is used along with the college in order to calculate a commute route. Each college has a commute route calculated for each of the ZIP codes in which its students live. Each commute route has a travel distance, a travel time, a travel cost, and a vehicle carbon emissions value associated with it. The travel distances and travel times were calculated when the Network Analyst extension computed the routes. The values for the travel cost and vehicle carbon emissions were derived from the travel distance. The vehicle carbon emissions were multiplied by the number of students living in the ZIP code from which the commute originates in order to present the total vehicle carbon emissions that are attributed to each commute route.
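As an informal illustration of the relationships described above, the conceptual model can be sketched with Python data classes. This is not the geodatabase schema used in the project; the class names, field names, ZIP code, and numeric values are hypothetical, and only the CID and name of Crafton Hills College come from this report.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CommuteRoute:
    """One route from a ZIP code centroid to a campus centroid."""
    zip_code: str
    students: int          # students residing in this ZIP code
    distance_miles: float  # from network analysis
    time_minutes: float    # from network analysis
    cost_usd: float        # derived from distance
    co2_lb: float          # derived from distance, per student

    @property
    def total_co2_lb(self) -> float:
        """Per-student emissions multiplied by the number of students."""
        return self.co2_lb * self.students


@dataclass
class College:
    cid: int
    name: str
    routes: Dict[str, CommuteRoute] = field(default_factory=dict)

    def add_route(self, route: CommuteRoute) -> None:
        # One commute route per ZIP code in which students live.
        self.routes[route.zip_code] = route


college = College(cid=981, name="Crafton Hills College")
# "00000" and the figures below are made-up placeholder values.
college.add_route(CommuteRoute("00000", 10, 10.0, 20.0, 5.0, 9.0))
print(college.routes["00000"].total_co2_lb)
```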

Logical Data Model
An ArcGIS File Geodatabase was created to store all features and tables. The geodatabase contains nine feature datasets and multiple tables. The nine feature datasets are Campus, District, Lines, Points, Polygons, Routes, ServiceAreas, Streets, and Student_Zip_200; they all use the NAD 1983 California Teale Albers projected coordinate system as specified by the client. The Campus Feature Dataset contains each community college campus centroid as a point feature class using the College Identification number (CID) as the name of the feature class. For example, the CID of Crafton Hills College is 981, so the campus centroid point feature is named Campus_981 (Figure 4-2).

Figure 4-2: Campus Centroids
Also within the Campus Feature Dataset is a single feature class containing the centroids of all the campuses, and individual feature classes of the campus centroids by community college region.
The District Feature Dataset contains the polygon feature class that is comprised of the 72 Community College Districts and the polygon feature class that depicts the non-district land in the state (Figure 4-3).

Figure 4-3: Community College Districts and non-district land
This feature dataset also contains a line feature class that depicts the boundaries between the districts with simplified lines for cartographic use (Figure 4-4).

Figure 4-4: Community College District Boundaries
The District Feature Dataset also contains feature classes of the districts aggregated to the community college region level.
The Lines Feature Dataset contains a single feature class named StateLines, which is simply the state boundaries and coastlines that were used in symbolizing the maps that were produced. The Points Feature Dataset holds the city point features for California and neighboring states that were also used in map production. Likewise, the Polygons Feature Dataset contains state polygons and a polygon representing Lake Tahoe.
The Routes Feature Dataset contains the line feature classes of the commute routes from the ZIP code delivery-based centroids to the campus centroids, as well as a line feature class depicting all the streets in the state. Each commute route was generated using the Network Analyst extension. The commute routes are named according to the college destination by CID; for example, CommuteRoutes_981 contains the routes to Crafton Hills College (Figure 4-5).

Figure 4-5: CommuteRoutes_981
The feature dataset named ServiceAreas contains the lines that were used to map the commuting zones of each campus, along with the buffers of the lines that were used to create the cartographic symbol used on the maps (Figure 4-6). The line feature classes are named so that ZoneLine_981_50 is the line that delimits the zone in which 50% of the closest students to the Crafton Hills College campus live; likewise ZoneLine_981_90 is the line for the zone containing 90% of the students. The buffers are polygon features that use the same naming convention as the line, thus the buffer for ZoneLine_981_50 is named ZoneBuffer_981_50.

Figure 4-6: ZoneLine_981_50 and ZoneBuffer_981_50
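This report does not detail how the 50% and 90% zone lines were derived, so the following Python sketch shows only one possible interpretation: rank the ZIP code routes by distance and find the smallest distance within which the requested share of students lives. The function name and the sample data are hypothetical.

```python
def zone_distance(routes, fraction):
    """Smallest route distance within which at least `fraction` of students live.

    routes: list of (distance_miles, students) pairs, one per ZIP code.
    fraction: share of students to enclose, in the interval (0, 1].
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(students for _, students in routes)
    if total <= 0:
        raise ValueError("routes must contain at least one student")
    running = 0
    last_distance = None
    for distance, students in sorted(routes):
        running += students
        last_distance = distance
        if running >= fraction * total:
            return distance
    return last_distance  # fallback; the loop normally returns first


# Made-up sample: (distance in miles, students) for four ZIP codes.
sample_routes = [(12.0, 25), (2.0, 40), (30.0, 5), (5.0, 30)]
zone_50 = zone_distance(sample_routes, 0.50)
zone_90 = zone_distance(sample_routes, 0.90)
print(zone_50, zone_90)
```

A distance threshold found this way could then be used to decide which ZIP codes fall inside each zone before any line or buffer geometry is drawn; that geometric step is outside the scope of this sketch.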
The Streets Feature Dataset contains line feature classes representing highways and interstates that were used in map production. The final feature dataset is named Student_Zip_200; it contains the polygon feature classes that depict each ZIP code from which students are commuting to each college. Along with the polygon feature classes, there are point feature classes that represent the delivery-based centroids of each ZIP code polygon. A delivery-based centroid is based upon the area where mail is delivered and not upon the shape and size of the ZIP code polygon. This places the centroid in the more populated area of the ZIP code polygon. The feature classes are named according to the CID of the college to which the students from each ZIP code are commuting. For example, CID_981_ZIP200 and CID_981_ZIP200_Polygons are the feature classes depicting the ZIP codes of students enrolled at Crafton Hills College (Figure 4-7).

Figure 4-7: CID_981_ZIP200 and CID_981_ZIP200_Polygons
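The CID-based naming convention used throughout the geodatabase can be summarized with a short Python helper. The function and dictionary keys are hypothetical conveniences; only the name patterns themselves (Campus_981, CommuteRoutes_981, ZoneLine_981_50, and so on) are taken from this chapter.

```python
def feature_class_names(cid: int, zone_percents=(50, 90)) -> dict:
    """Build the per-college feature class names from a CID."""
    names = {
        "campus_centroid": f"Campus_{cid}",
        "commute_routes": f"CommuteRoutes_{cid}",
        "zip_centroids": f"CID_{cid}_ZIP200",
        "zip_polygons": f"CID_{cid}_ZIP200_Polygons",
    }
    for pct in zone_percents:
        names[f"zone_line_{pct}"] = f"ZoneLine_{cid}_{pct}"
        names[f"zone_buffer_{pct}"] = f"ZoneBuffer_{cid}_{pct}"
    return names


crafton = feature_class_names(981)  # Crafton Hills College
for key, name in crafton.items():
    print(f"{key}: {name}")
```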
The geodatabase also contains several tables. The table titled Campus_Estimates is one of the tabular results of this project. It contains the average vehicle miles traveled, total vehicle miles traveled, average travel time, average travel cost, total travel cost, average vehicle carbon emissions, and total carbon emissions for each community college. Similar to the Campus_Estimates table is the District_Estimates table, which records the same set of averages and totals for each district. Enrollment is the table that lists every student enrolled in one of the community colleges during the Fall of 2007. It includes a pseudo student identification number, the CID of the college that the student attends, the ZIP code that the student lists as their home ZIP code, whether the student is a distance education student (all distance education students were removed from the table), the high school of origin for many of the students, the number of contact hours for courses taken for credit, and the number of contact hours for courses not taken for credit. Also included in the geodatabase is a frequency table for each college that lists the number of students from each ZIP code who commute to the college. These tables are titled using the CID of each college, so that Frequency_981 is the frequency table for Crafton Hills College.
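The averages and totals stored in a table such as Campus_Estimates can be illustrated with a small Python sketch. The text above does not state whether the averages are weighted by the number of students in each ZIP code; the sketch assumes that they are, and the function name and sample figures are hypothetical.

```python
def campus_summary(routes):
    """Summarize vehicle miles traveled for one campus.

    routes: list of (students, miles_per_trip) pairs, one per ZIP code.
    Assumes a student-weighted average, which is an interpretation
    rather than a documented detail of the project.
    """
    total_students = sum(students for students, _ in routes)
    total_miles = sum(students * miles for students, miles in routes)
    average = total_miles / total_students if total_students else 0.0
    return {"total_vmt": total_miles, "avg_vmt": average}


# Made-up sample: 10 students commute 5 miles, 30 students commute 15 miles.
summary = campus_summary([(10, 5.0), (30, 15.0)])
print(summary)
```

District-level figures could be produced the same way by pooling the routes of every campus in a district before summarizing.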

Data Sources
The data used in completing this project were procured from two sources: the California Community College GIS Collaborative (CCCGIS) and Tele Atlas North America, Inc., as published by ESRI. Much of the original data from these sources (Table 4) was modified before being loaded into the geodatabase; this is covered in more detail in section 4.4.

Data Scrubbing and Loading
All of the data were originally in file formats that had to be changed in order to be incorporated in an ArcGIS Geodatabase. Along with changing the file format of the data, many of the names were changed (Table 5). The datasets listed in Table 5 are the datasets as they were before any analysis was performed and before they were changed further. Many of the datasets were broken into multiple datasets so that the data contained within were relevant for just one of the colleges and named with the CID incorporated into the name. This is described in detail in the next chapter.

Summary
This chapter presented the conceptual and logical data models that were used for this project. The datasets that populated the geodatabase before the analysis and at the conclusion of the analysis were both introduced in this chapter.

Chapter 5 -Implementation
This chapter provides a detailed description of the steps that were taken in performing the analysis that produced the final results of this project. The steps include preparing the data to be used with the Network Analyst extension, performing the network analysis, and preparing the results for presentation to the client. There are numerous ways to achieve the same results when using GIS; the steps discussed in this chapter are the ones that were used in this project.

Preparation for Network Analysis
In order to perform the network analysis it was necessary to have a single feature class for each college that included every ZIP code in which commuting students resided. Creating these feature classes was accomplished in ArcMap with the table titled Enrollment, the ZIP_Points feature class, and the All_Campus_Centroids feature class. First, a relate was established between All_Campus_Centroids and Enrollment using the CID field in both tables (Figure 5-1). Next, a relate between Enrollment and ZIP_Points was established with ZIP as the base field (Figure 5-2). With the relates established for all three tables, it was simply a matter of selecting a campus in order to have all the relevant ZIP code centroids selected as well (Figure 5-3). Once a set of ZIP code centroids was selected, it was exported to a feature class with a naming convention that included the CID of the college to which the centroids were related. For example, the ZIP code centroid feature class for Crafton Hills College is named CID_981. The next step was to add a field to the ZIP code centroid feature classes indicating the number of students residing in each ZIP code area who commute to the relevant college. A table named Enrollment_Frequency was created using the Frequency tool in the ArcToolbox, with the CID and ZIP fields serving as the frequency fields. The resulting Enrollment_Frequency table contained every combination of ZIP code and CID along with the number of times that each combination occurred. The Enrollment_Frequency table was then divided into multiple tables, one for each college. This was accomplished in ArcMap by establishing a relate between Campus_Centroids and Enrollment_Frequency. By selecting a CID in the Campus_Centroids feature class, all rows in the Enrollment_Frequency table containing the same CID were selected. The selected rows were then exported to a new table named so that Frequency_981 was the ZIP code frequency table for Crafton Hills College.
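The counting step performed by the Frequency tool can be illustrated outside of ArcGIS. The following is a minimal Python sketch, not the procedure used in the project: the enrollment rows, the student identifiers, and the CID 982 are hypothetical values chosen only for illustration, and the function name build_frequency_tables is an assumption. It counts each combination of CID and ZIP code and then splits the counts into one table per college, following the Frequency_981 naming convention.

```python
from collections import Counter, defaultdict

# Hypothetical enrollment rows: (pseudo student id, CID, home ZIP code).
# All values are illustrative only; CID 982 is an assumed second college.
enrollment = [
    ("s1", "981", "92374"),
    ("s2", "981", "92374"),
    ("s3", "981", "92399"),
    ("s4", "982", "92374"),
]


def build_frequency_tables(rows):
    """Count students per (CID, ZIP) pair, then split the counts by college."""
    pair_counts = Counter((cid, zip_code) for _, cid, zip_code in rows)
    per_college = defaultdict(dict)
    for (cid, zip_code), count in pair_counts.items():
        # One table per college, named with the CID as in the report.
        per_college["Frequency_" + cid][zip_code] = count
    return dict(per_college)


tables = build_frequency_tables(enrollment)
```

With the sample rows above, tables["Frequency_981"] holds the number of students from each ZIP code who attend the college with CID 981.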
The feature class All_Campus_Centroids contained a point feature for each college campus centroid; however, for the network analysis step it was felt that it would be best to have one feature class for each college. The Campus_Centroids feature class was divided into individual feature classes in ArcMap by selecting each college campus centroid and exporting it to its own feature class, which was given a name containing the CID of the college that the campus centroid represents. For example, the campus centroid feature class for Crafton Hills College is named Campus_981.
Many of the ZIP codes reported in the enrollment data were questionable due to the distance between the campus and the student's reported address. It was obvious that students were not commuting from places as far away as the East Coast, but a judgment call was necessary with regard to ZIP codes which were within California yet still at a considerable distance from the related campus. It was decided to remove ZIP codes from the data that were more than 200 miles from the campus, measured as straight line distance. This was accomplished in ArcMap by first creating 200 mile buffers around each campus centroid and then using the Clip tool to extract all the ZIP code centroids that were within the 200 mile buffer of the campus to which they were related (Figure 5-4). The selected ZIP code centroids were then exported to a feature class named so that CID_981_ZIP200_centroid contains the ZIP code centroids within a 200 mile radius of Crafton Hills College.
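The effect of the 200 mile cutoff can be sketched without a buffer and clip. The Python example below is an illustration under stated assumptions rather than the method used in the project: it substitutes a great-circle (haversine) distance for the planar buffer, the Earth radius is an assumed constant, and the coordinates are rough, illustrative values rather than the project's centroids.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius for this sketch


def straight_line_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(campus, zip_points, max_miles=200.0):
    """Keep only the ZIP code centroids within max_miles of the campus."""
    lat0, lon0 = campus
    return {
        zip_code: point
        for zip_code, point in zip_points.items()
        if straight_line_miles(lat0, lon0, point[0], point[1]) <= max_miles
    }


# Rough, illustrative coordinates (latitude, longitude); not project data.
campus_981 = (34.04, -117.10)
zip_points = {
    "92374": (34.06, -117.17),  # nearby centroid
    "95501": (40.80, -124.16),  # distant centroid in northern California
}
kept = within_radius(campus_981, zip_points)
```

Only the nearby centroid survives the filter in this sample; the distant one is dropped in the same way the clipped feature class excludes points outside the buffer.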

Figure 5-4: 200 Mile Buffer and ZIP Code Points
The final step taken before performing the network analysis was a cleanup of the data by removing unneeded feature classes that were used simply to produce the feature classes that are needed to perform the analysis and produce the resulting products. At this point all the 200 mile campus buffers were deleted, along with the pre-clipped ZIP code centroids. The Enrollment_Frequency table was removed; however the individual frequency tables for each college were retained.

Network Analysis
With the data prepared, the next step was to perform network analysis to determine the quickest routes between the centroids of the ZIP codes listed for the student commuters and the centroid of the campus which they attend. This was accomplished using the Network Analyst extension in ArcMap. It was decided to use the Closest Facility function since it was the easiest way to find the quickest routes from multiple ZIP code centroids to the campus centroid.
Once the Network Analyst extension was enabled, the Make Closest Facility Layer tool was used (Figure 5-5). The network dataset named streets, which is located on the network server and not in the geodatabase, was used as the input analysis network. The network dataset was left on the server because the proprietary nature of the data made it impossible to import it into the geodatabase. The output layer name was left as the default Closest Facility. The impedance attribute was set to Time, and the travel from or to facility option was left as TRAVEL_TO. The remaining two options, Default cutoff and Number of facilities to find, were left with the default values. Under Accumulators, both Length and Time were selected in order to have the total time and total distance as attributes of each route that was calculated. Under Restriction, the options that were selected were Pedestrian Walkway and Pedestrian Ferry, and for the U-turn policy option the default ALLOW_UTURNS was selected. Under Hierarchy, the Use hierarchy in analysis box was unchecked, since the hierarchy was found to create unrealistic routes: it dictated the use of interstate highways that would take the student on a much longer trip than if other roads were used. Finally, the Output path shape option under the Output options was left as the default TRUE_LINES_WITH_MEASURES.

Figure 5-5: Make Closest Facility Layer Tool
Once the Closest Facility Layer was created and the Network Analyst window was opened, a campus centroid feature class was loaded as the facility location and the related ZIP code centroid feature class was loaded as the location of the incidents. When loading the facility location, the Sort Field was set to CID, and in the Location Analysis Properties box the Name Field was set to CID as well (Figure 5-6). Likewise, when loading the incident locations, the Sort Field and the Name Field were both set to ZIP (Figure 5-7). Once the Facility and the Incidents were loaded and the network solved, the resulting Routes (Figure 5-8) were exported to a feature class named with the CID so that, e.g., CommuteRoute_981 held the quickest routes between the campus centroid of Crafton Hills College and all of the ZIP code centroids of the students who commute there. This analysis was completed for all campuses.

Figure 5-8: Closest Facility Routes
In order to make the routes calculated by the network analysis useful for further analysis, it was necessary to modify the attribute table of each route feature class. First, a new field called ZIP was added. The ZIP field was populated by using the Field Calculator and specifying that ZIP = LEFT([NAME], 5) (Figure 5-9). This set ZIP equal to the first five characters of the NAME field, which was populated with the ZIP code of the origin of the route and the CID of the campus which was the destination, so that 92374 - 981 was the name of the route between the 92374 ZIP code centroid in Redlands and Crafton Hills College. Next, the Join Field tool was used to add FREQUENCY fields to the commute route attribute tables. This was done by specifying the commute route feature class as the Input Dataset with ZIP as the Input Join Field, and the ZIP code centroid feature class of the related campus as the Join Table with FREQUENCY set as the Join Field. Another field named VMT, which stands for vehicle miles traveled, was added to each commute route attribute table. The purpose of this field was to store the total miles driven from each ZIP code to the campus. This field was populated by multiplying the TOTAL_LENGTH by the FREQUENCY in the Field Calculator.
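The three attribute table steps described above (extracting the ZIP code from the route name, joining the student count, and computing VMT) can be summarized in a short Python sketch. The route lengths and student counts are hypothetical, the function name add_vmt is an assumption, and the sketch assumes that TOTAL_LENGTH is in miles.

```python
# Hypothetical route records and per-ZIP student counts; values are illustrative.
routes = [
    {"NAME": "92374 - 981", "TOTAL_LENGTH": 6.0},
    {"NAME": "92399 - 981", "TOTAL_LENGTH": 3.5},
]
frequency_by_zip = {"92374": 120, "92399": 40}


def add_vmt(routes, frequency_by_zip):
    """Return copies of the routes with ZIP, FREQUENCY, and VMT fields added."""
    result = []
    for route in routes:
        zip_code = route["NAME"][:5]  # same effect as LEFT([NAME], 5)
        frequency = frequency_by_zip.get(zip_code, 0)  # joined on ZIP
        vmt = route["TOTAL_LENGTH"] * frequency  # miles driven from this ZIP code
        result.append({**route, "ZIP": zip_code, "FREQUENCY": frequency, "VMT": vmt})
    return result


commute_routes = add_vmt(routes, frequency_by_zip)
```

In this sample, the 6.0 mile route used by 120 students yields a VMT of 720.0 for a one-way trip.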

Summary
This chapter presented the tools and the methods used in performing the analysis that yielded the results from which the remaining commuting impacts of interest to the client were calculated. The vehicle miles traveled and the travel times that were calculated are presented on their own and are also used to calculate the average distance traveled, the average travel time, the cost of the commute in dollars, and the vehicle carbon emissions due to commuting students. These results, which are discussed in chapter 6, were presented in tabular form and in map form.

Chapter 6 -Results and Analysis
This chapter covers the many issues with the data, including questionable ZIP codes, unknown ZIP codes, and lack of information regarding off-campus centers. This chapter also provides details on results that were delivered to the client. The results include tabular summaries at both the campus and district level, and two sets of maps.

Data Issues
Although the enrollment data received from the client accurately reported what each college had on record, it contained ZIP codes that were questionable. It was obvious that students were not commuting from locations such as Portland, Oregon to colleges in Los Angeles, California. The first step in removing this error from the data was to strip out all ZIP codes from the ZIP code centroid feature class which were located outside of California. An exception was made for students in neighboring states who appeared to be within a reasonable commuting distance of the college they were reported to attend. One example of out-of-state students who were retained in the enrollment data is the group of students attending Lake Tahoe Community College whose ZIP codes located them in Nevada (Figure 6-1).

Figure 6-1: Lake Tahoe Community College and Student ZIP Codes
After removing the majority of the out-of-state students, it was still apparent that there were many other students who were not making the commute from their reported ZIP code to the college they were attending. For example, it is not conceivable that a student would commute from Humboldt County to Los Angeles. In order to remove those students from the analysis, the buffer operation described in chapter 5 was used. Even after removing the students whose ZIP code placed them outside a 200 mile radius of the college they were attending, there were students whose commute time was still questionable. With guidance from the client, it was decided to include only the closest ninety percent of the students in the analysis and to drop the remaining ten percent.
Though the enrollment data contained the data that was reported by the individual colleges, it was far from complete in many cases. Only fifteen of the colleges had ZIP codes reported for 100% of the students; the remaining ninety-four had unknown ZIP codes reported (Table 6). To address this problem, total vehicle miles traveled and vehicle carbon emissions values were refined by extrapolating to the 100% level; that is, the estimated values were adjusted by dividing by the fraction of students with valid ZIP code data. The worst discrepancies, unknown ZIP codes for over one thousand students or for more than 10% of students, were reported by eight of the colleges (Table 7). Another issue with the enrollment data is the lack of a way of knowing how many times per week a student commutes to the campus. All students that were indicated as distance education students were removed from the enrollment table, as reported in chapter 4; this still left the question of how often the remaining students made the commute. The enrollment table included a field that showed each student's credit hours; however, this is their accumulated credit hours and not the credit hours for which they were enrolled during the fall of 2007. It was originally thought that the credit hours could be used to create a weight for the commute routes; however, the client did not believe this to be a reasonable idea and it was not pursued. It became apparent that with the currently held data there was no reasonable way to know how many trips per week each student made to campus.
Another issue is that each college may have one or more offsite centers or satellite campuses where students may attend some classes. The enrollment data did not include any information regarding whether the student was attending class at the main campus or one of the other sites. When this issue was raised with the client, the solution that was suggested was to simply ignore the existence of the centers and satellite campuses. The client explained that the number of students who attend class at locations other than the main campus was small when compared to the number of students attending class at the main campus. It was also suggested that many of the students who attend class at one of the off-site locations also attend class at the main campus. There are two centers which are large enough to have their own CIDs; however, the client asked that they not be included in this project. One last issue concerns the newest college to become accredited, Woodland Community College. The college was not an accredited California Community College until June 2008, after the enrollment data used for this project was collected.

Results
As stated in the first chapter of this report, the results of this project were delivered to the client in two forms. The first form of deliverable results was a tabular summary of the estimated vehicle miles traveled, transit time, transit cost, and vehicle carbon emissions due to round trip commuting to each of the individual colleges. The second form was the maps that summarized the estimated vehicle miles traveled, transit time, transit cost, and vehicle carbon emissions due to round trip commuting.

Tabular Summary
Tabular summaries presenting the resulting travel related estimates of each campus and each district were created as a Microsoft Excel spreadsheet and then exported as a PDF for delivery to the client. The tables presented the total vehicle miles traveled, the average vehicle miles traveled, the average transit time, the total transit cost, the average transit cost, the total vehicle carbon emissions, and the average vehicle carbon emissions for each college campus and each district.
The first step in producing the tables was the export of all the commute route feature class attribute tables to comma delimited text files. The commute route text files were opened in Microsoft Excel and saved as Excel spreadsheets. The total number of students commuting to each college was determined by using Microsoft Access to find duplicates of the CID field in the enrollment database. This number was multiplied by 0.9 in order to determine how many students made up 90 percent of the commuting students. The commute route spreadsheets were then sorted by travel time from shortest commute to longest commute. A running total of students was calculated using the frequency field that was joined to the commute route table earlier. Then it was simply a matter of finding the route in the table where the running total of students was equal to or closest to the ninety percent value. The total vehicle miles traveled was calculated by summing the VMT field from the shortest commute through the commute found which had the running total equal to or closest to ninety percent of the students. The average was calculated as the total vehicle miles traveled divided by the number found to be equal to or closest to ninety percent of the students. The average transit time was calculated by first summing the Total_Time field from the shortest commute through the commute found to have the running total equal to or closest to ninety percent of the students, and then dividing the sum by the ninety percent running total number. At this point the values were adjusted to account for all of the commuting students that had been removed from the analysis, either because their ZIP codes were unknown or because they were outside the 200 mile or 90%-level thresholds. This adjustment essentially assumes that the average commute for all of the students is the same as the average commute for all of the students with valid ZIP code data.
Although obviously an estimate, the result is a better estimate of the totals and averages for each campus, given that several campuses had missing or bad ZIP code data for more than 1000 or 10% of their students. This adjustment was effected by dividing the totals by the ratio of students with valid data to all students. For example, if a campus had 1000 students and a total vehicle miles traveled of 800,000 miles for the 800 students who had valid ZIP codes, were within 200 miles, and were within the 90% threshold, the total vehicle miles traveled would be adjusted by dividing by 800/1000, or 0.80, to yield an adjusted total vehicle miles traveled of 1,000,000 miles. This adjustment not only improves the estimate for all campuses with missing or marginal ZIP code data, but also adds back in the 10% of the students who were removed with the 90% threshold.
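The ninety percent cutoff and the extrapolation back to all students were carried out in spreadsheets; the same logic can be sketched in Python. The example below is a minimal illustration and not the project's actual workbook: the route records, ZIP codes, and function names are hypothetical, and it covers only vehicle miles traveled.

```python
def closest_share(routes, total_students, share=0.9):
    """Return routes, shortest travel time first, up to the route whose running
    student total is closest to share * total_students."""
    target = share * total_students
    ordered = sorted(routes, key=lambda r: r["Total_Time"])
    running = 0
    best_count, best_gap = 0, target  # gap if no route at all were kept
    for count, route in enumerate(ordered, start=1):
        running += route["FREQUENCY"]
        gap = abs(running - target)
        if gap < best_gap:
            best_count, best_gap = count, gap
    return ordered[:best_count]


def adjusted_total(total, valid_students, all_students):
    """Extrapolate a total to all students: divide by valid_students / all_students."""
    return total * all_students / valid_students


# Hypothetical commute routes for one campus with 100 enrolled commuters.
routes = [
    {"ZIP": "92399", "Total_Time": 20.0, "FREQUENCY": 12, "VMT": 180.0},
    {"ZIP": "92373", "Total_Time": 5.0, "FREQUENCY": 50, "VMT": 150.0},
    {"ZIP": "92223", "Total_Time": 90.0, "FREQUENCY": 8, "VMT": 560.0},
    {"ZIP": "92374", "Total_Time": 10.0, "FREQUENCY": 30, "VMT": 180.0},
]
kept = closest_share(routes, total_students=100)
kept_students = sum(r["FREQUENCY"] for r in kept)
kept_vmt = sum(r["VMT"] for r in kept)
average_vmt = kept_vmt / kept_students
adjusted_vmt = adjusted_total(kept_vmt, kept_students, 100)
```

In this sample the running totals are 50, 80, 92, and 100 students, so the cutoff falls at 92 students, the value closest to 90, and the longest route is dropped. Calling adjusted_total(800000, 800, 1000) reproduces the worked example in the text, giving 1,000,000 miles.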
The remaining four fields on the table (total transit cost, average transit cost, total vehicle carbon emissions, and average vehicle carbon emissions) were calculated by using the vehicle miles traveled field along with constants used by the carbon dioxide emissions calculator at Carbonify.com and the driving cost calculator at CommuteSolutions.org. The total transit cost was calculated by multiplying the total vehicle miles by $0.65 per mile, which is the total cost per mile of insurance, registration, taxes, finance charges, vehicle depreciation, fuel, maintenance, and tires. The average transit cost was calculated by dividing the total transit cost by the number of commuting students. The total vehicle carbon emissions were calculated by multiplying the total vehicle miles traveled by 1.1 pounds per mile, which is the estimated amount of carbon emissions per mile for a medium size car. The carbon emissions estimate for a medium size car was used because it falls roughly midway between the estimated carbon emissions of a small car and those of a sport utility vehicle. The average vehicle carbon emissions were calculated by dividing the total vehicle carbon emissions by the number of commuting students. The tabular summary of this project at the campus level can be found in Appendix A of this report.
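The cost and emissions arithmetic can be written out as a short Python sketch. The two constants are the per-mile values stated above; the function name, the dictionary keys, and the sample campus of 1,000 students driving 1,000,000 miles are assumptions made only for illustration.

```python
COST_PER_MILE_USD = 0.65  # total driving cost per mile, as stated in the text
CO2_LB_PER_MILE = 1.1     # pounds of carbon emissions per mile, medium size car


def cost_and_emissions(total_vmt, students):
    """Derive the four cost and emissions fields from total vehicle miles traveled."""
    total_cost = total_vmt * COST_PER_MILE_USD
    total_co2 = total_vmt * CO2_LB_PER_MILE
    return {
        "total_cost_usd": total_cost,
        "average_cost_usd": total_cost / students,
        "total_co2_lb": total_co2,
        "average_co2_lb": total_co2 / students,
    }


# Hypothetical campus: 1,000 commuting students, 1,000,000 total vehicle miles.
estimate = cost_and_emissions(1_000_000, 1000)
```

For this hypothetical campus the totals come to roughly $650,000 and 1,100,000 pounds of carbon emissions, or about $650 and 1,100 pounds per student.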
The second table, which summarizes the results of this project at the district level, was created by using the results found in the individual campus level table. The total vehicle miles traveled was calculated by summing the total vehicle miles traveled of each campus in a given district. The average vehicle miles traveled was calculated by dividing the total vehicle miles traveled by the total number of commuting students from all campuses in a given district. The average transit time was calculated by first summing the adjusted total transit times of each campus in the district and then dividing the sum by the total number of commuting students in the district. The total transit cost was calculated by summing the total transit cost of each campus in a given district, and the average transit cost was calculated by dividing the total transit cost by the total number of commuting students in the district. The total vehicle carbon emissions and average vehicle carbon emissions were calculated in the same way as the total transit cost and average transit cost, only using the individual campus values of total vehicle carbon emissions rather than total transit cost. The tabular summary of this project at the district level can be found in Appendix B of this report.
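The district level roll-up can be sketched in the same way. The Python example below uses hypothetical district labels, student counts, and totals, and shows only vehicle miles traveled; the other totals would follow the same pattern. The point it illustrates is that the district average is the district total divided by the district's students, which is not the same as averaging the campus averages.

```python
# Hypothetical campus level results; district labels and values are illustrative.
campuses = [
    {"district": "D1", "students": 1000, "total_vmt": 12000.0},
    {"district": "D1", "students": 3000, "total_vmt": 24000.0},
    {"district": "D2", "students": 500, "total_vmt": 5000.0},
]


def district_summary(campuses):
    """Sum campus totals by district, then divide by the district's student count."""
    totals = {}
    for campus in campuses:
        entry = totals.setdefault(campus["district"], {"students": 0, "total_vmt": 0.0})
        entry["students"] += campus["students"]
        entry["total_vmt"] += campus["total_vmt"]
    for entry in totals.values():
        entry["average_vmt"] = entry["total_vmt"] / entry["students"]
    return totals


districts = district_summary(campuses)
```

In this sample, district D1 has an average of 9.0 miles per student (36,000 miles over 4,000 students), whereas a simple mean of the two campus averages (12.0 and 8.0) would give 10.0.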

Maps
Two different sets of maps were produced for this project. The first set was the maps requested by the client that summarized the estimated vehicle miles traveled, transit time, transit cost and vehicle carbon emissions for each college campus and each district. The second set of maps details the commuting zones of fifty percent and ninety percent of the students from each campus.
The set of maps requested by the client showed the results of this project at both the individual campus level and at the district level. Seven individual maps were made for each of the individual campus level results and each of the individual district level results. Each map is of one of the defined community college regions and contains either all the campuses or all the districts which fall within that region. The regions are geographic divisions of the state: Northern, Central, Bay Area, and Southern. The Southern region was further broken down into four sub-regions since it contains too many campuses and districts to show clearly on one map. A total of 14 maps were produced for each region or sub-region, seven at the campus level and seven at the district level. Each one of the seven maps at either the campus or district level displays one of the calculated results of this project, which are the round trip total vehicle miles traveled, average vehicle miles traveled, average transit time, total transit cost, average transit cost, total vehicle carbon emissions, and average vehicle carbon emissions.
The first step in producing the client requested maps was importing the campus and district spreadsheets into the geodatabase. Next the Join Field tool was used to add the seven fields that contained the attributes that were needed to produce the maps to the All_Campus_Centroids and DistrictPolygons feature classes. The regional maps of campuses and districts were then produced using graduated symbols to symbolize the seven project results. The maps were then exported to PDF for delivery to the client. The set of campus level maps is found in Appendix C and the district level maps are found in Appendix D.
The second set of maps was produced to show the commuting zones of each campus and how many of them overlapped. Each map displayed the commuting zones for at least two college campuses. In order to produce the commuting zone maps, it was first necessary to determine the transit time within which fifty percent and ninety percent of the students for each campus lived. The number of students for each college that made up the closest ninety percent had already been determined when producing the tabular results of the project. The closest fifty percent was determined in the same way. Once both of these values were determined for each college, it was just a matter of looking up the transit times on the commute route spreadsheets. The next step was to use the Network Analyst extension in ArcMap to compute service areas for each college campus using the transit times for the closest fifty percent and ninety percent of the students. The network dataset named streets, which is located on the network server and not in the geodatabase, was used as the input analysis layer in the Make Service Area Layer tool. Travel from or to facility was set to TRAVEL_TO, the impedance attribute was set to Time, and all other settings were left at their defaults. The next step was to load a campus centroid feature class as the facility location, and then set the Default Break to the determined transit time on the Analysis Settings tab in the Layer Properties window (Figure 6-2).

Figure 6-2: Layer Properties Window
After the facility location was set to a campus centroid and the transit time was entered, the analysis was run, producing as output a polygon depicting all the area that is within the defined transit time. For Crafton Hills College, fifty percent of the students live within 10.8 minutes of the campus, so the resulting polygon depicts the area within a 10.8 minute drive time of the campus (Figure 6-3).

Figure 6-3: 50% Commuting Zone for Crafton Hills College
The next step was to create the second commuting zone polygon. The second polygon depicts the transit time within which ninety percent of students live. The resulting polygon was then displayed on the map along with the first polygon (Figure 6-4). The polygon was then exported to the geodatabase as a feature class that was only kept temporarily.

Figure 6-4: 50% and 90% Commuting Zones for Crafton Hills College
After creating both polygons for the first college campus and exporting them to the geodatabase, the two polygons for the next college campus on the map were created and exported to the geodatabase as well. The next step was to use the Feature To Line tool to create line features of the polygon boundaries (Figure 6-5). The lines were then simplified using the Line Simplify tool set to Bend_Simplify, with a reference base line chosen based upon the scale of the map and the size of the feature to be simplified, in order to make them more visually appealing for the final maps (Figure 6-6). After using the Line Simplify tool it was still necessary at times to manually edit the lines to remove unwanted vertices that caused errant spikes. A buffer was created on the inner side of the lines to create the symbols that represent the commute zones (Figure 6-7). In the Buffer tool window the distance was set to a reasonable length depending on the scale of the map, the Side Type was set to RIGHT so that only the inside of the lines would be buffered, and the End Type was set to FLAT so the buffer would more accurately follow the shape of the line. At times the buffer would cross outside of the line due to sharp bends and would need to be trimmed to the inside of the line. When this was necessary, the Feature to Polygon tool was used to create a polygon with the same shape as the line. The resulting polygon was then used in the Clip tool as the Clip Feature while the buffer was selected as the Input Feature.

Figure 6-7: Commute Zone Symbols for Crafton Hills College and San Bernardino Valley College
The next additions to the map were dot densities of the students commuting to the college campuses from the various ZIP codes. The first step was to select the ZIP code polygons in which commuting students live. This was accomplished by adding the ZIP code polygons and the ZIP code centroids of the students attending the college in question to the map and then using Select By Location to select features from the ZIP code polygons which contain the ZIP code centroids. The selected features were then exported to the geodatabase and named so that CID_981_ZIP200_Polygons contains the ZIP code polygons of the students attending Crafton Hills College. Next, the Join Field tool was used to join the FREQUENCY field of the ZIP code centroid feature class to the ZIP code polygon feature class by using ZIP as both the Input Join Field and the Output Join Field. The join had to be run twice because many of the ZIP code centroids belong to enclosed ZIP codes, which are located within larger ZIP codes. When the Join Field tool was run the second time, ENC_ZIP was set as the Input Join Field while the Output Join Field was again set to ZIP. This created a field in the ZIP code polygon attribute table called FREQUENCY_1. Because not every ZIP code polygon had an enclosed ZIP code centroid within it, many of the polygons had a null value for FREQUENCY_1. These null values were set to zero by selecting all the features with the null value in the FREQUENCY_1 field and then using the Field Calculator to set the value to zero. A new field was then added to the attribute table and named TOTAL_FREQUENCY, and the Field Calculator was used to add FREQUENCY to FREQUENCY_1 to populate the new field. The ZIP code polygons were then symbolized as a dot density with each dot representing 10 students.
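The handling of null values before summing the two frequency fields can be shown in a few lines of Python. This is only an illustration of the arithmetic: the polygon records and counts are hypothetical, and None stands in for a null value in the attribute table.

```python
# Hypothetical ZIP code polygon records; None represents a null FREQUENCY_1.
polygons = [
    {"ZIP": "92374", "FREQUENCY": 120, "FREQUENCY_1": 15},
    {"ZIP": "92399", "FREQUENCY": 40, "FREQUENCY_1": None},
]


def add_total_frequency(polygons):
    """Set null FREQUENCY_1 values to zero, then add the two frequency fields."""
    result = []
    for polygon in polygons:
        enclosed = polygon["FREQUENCY_1"] if polygon["FREQUENCY_1"] is not None else 0
        total = polygon["FREQUENCY"] + enclosed
        result.append({**polygon, "FREQUENCY_1": enclosed, "TOTAL_FREQUENCY": total})
    return result


with_totals = add_total_frequency(polygons)
```

At ten students per dot, the first sample polygon (135 students) would be drawn with about 13 dots and the second (40 students) with 4.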
The maps were finished by adding the district and non-district land polygons, highways, cities, and the required labels. The set of commuting zone maps can be found in Appendix E.

Summary
This chapter highlighted the various issues and shortcomings that were found within the enrollment data, and described what was done to work around them. This chapter also introduced the resulting deliverables and detailed their production. The next chapter provides the conclusions of this project.

Chapter 7 -Conclusions and Future Work
This final chapter of the report provides a summary of how this project satisfied the requirements set forth by the Foundation for California Community Colleges. This chapter also presents possible future work that can follow from this project.

Conclusion
The objectives set forth by the client were to present estimates of specific factors and impacts of student commuting to the campuses of the California Community College system. The requirements stated the resulting estimates were to be presented in the form of tabular summaries and maps at the campus and district levels. The specific factors and impacts related to students commuting from home to campus that were required to be included in the results were: vehicle miles traveled, student transit time, student transit cost, and vehicle carbon emissions.
The results of this project met the requirements and objectives that were dictated by the client. The project provided estimates for the following factors and impacts:
• Total Vehicle Miles Traveled
• Average Vehicle Miles Traveled
• Average Transit Time
• Total Transit Cost
• Average Transit Cost
• Total Vehicle Carbon Emissions
• Average Vehicle Carbon Emission
The above factors and impacts were estimated for the round trip commutes that are made by the students between campus and home. The resulting estimates were presented as tabular summaries and map sets at both the campus and district levels.

Possible Future Work
The possibility exists to refine the estimates of the factors and impacts that were presented as the results of this project. That refinement would be based upon the ability to acquire more accurate data, more detailed data, and/or additional data.

Obtain More Accurate Data
As stated in chapter six, the enrollment data that was acquired for use with this project included many ZIP codes that were questionable, as well as many records with unknown ZIP codes. With accurate local residence ZIP codes for all students, it would be possible to calculate much more accurate estimates. It could be beneficial to undertake a project like this on a campus-by-campus basis and deal directly with the administrative offices which maintain student information. By dealing directly with the administrative offices it might be possible to ensure that the ZIP codes are for local residences, not permanent addresses or billing addresses.

Obtain More Detailed Data
As also stated in chapter six, there are no details regarding the number of commutes each student makes per week to the campus or whether the student is commuting to the main campus or one of the off-site centers or satellite campuses. With the addition of information about each student's schedule, it would be possible to compute good estimates of the commute related factors and impacts for an entire week or for an entire semester. Without that information, it was only possible to produce estimates based on a single round trip commute for each student. This approach effectively assumes that every student makes the same number of commutes, which is not very likely. More accurate estimates would also result if information regarding the off-site centers and satellite campuses was provided. It would be necessary to have the location of the off-site centers and satellite campuses, as well as the detailed attendance records. The off-site center and satellite attendance should be reflected in the students' schedules. Once again it might be beneficial to undertake this project on a campus-by-campus basis in order to have the support of the administration.

Incorporate Public Transportation Usage
It is not likely that every student commutes to campus by driving their own car; some are certain to use public transportation. The inclusion of information concerning the use of public transportation by students would yield more accurate estimates. Information concerning public transportation usage could be obtained by a survey of commuting habits.

Potential Applications / Post-Analysis
The results of this project have the potential to be used in further applications and analysis. Many questions can be asked concerning the distribution of the estimates of commute related factors and impacts. Why do certain campuses have higher or lower estimates than other campuses? What are the causes of higher than average values? The information presented in the commute zone maps leads to questions regarding the distribution of students, and why they choose a certain college over another one that might be located closer to home.