Case Study – Best Places to Live in California | Part 1 – Identify the Key Variables

SUMMARY

Businesses mine data in order to find patterns and correlations that may affect their business processes, products and services, and customer behaviors. When large amounts of data can be translated to patterns and correlations, the data then becomes information that the business can use for decision-making. A business professional who understands the analytics and can use the data to identify relationships, predict outcomes, and provide recommendations is invaluable to the company’s overall goals of efficiency, profitability, and growth.

Scenario
Home Trend is the top magazine in California and the go-to source for the best homes in the state and the best places to live in the cities and counties of Southern California. As the data analyst for Home Trend, you have been tasked with selecting the best places to live in the state from a list of metropolitan areas in California, based on variables provided by the company’s researchers. The data sheet gives information for each of the 20 cities, organized by county. You will write a report that summarizes the data results, ranks the cities, and identifies the key predictors for choosing the best place to live.

The following variables are provided to assess the “quality of life”:

  • Income (average income per person; higher incomes are considered better in this scenario)
  • Commute (average daily round trip, in minutes; a shorter commute is deemed better)
  • Job growth (forecast percentage increase over the next 5 years; higher numbers indicate a more positive outlook)
  • Physicians (number of doctors per 100,000 population; higher numbers are deemed better for the city)
  • Murder rate (10-year average per 100,000; lower numbers are viewed more positively)
  • Rape rate (10-year average per 100,000; lower numbers are viewed more positively)
  • Golf (number of residents per golf hole; lower numbers are viewed more positively)
  • Restaurants (quality index based on points selected by residents in survey questions, where 1 is a five-star experience and 5 is a no-star rating; lower numbers are viewed more positively)
  • Housing (median home price; higher numbers are deemed more positive)
  • Median age (median age of residents)
  • Literacy (public library circulation per resident; higher numbers are viewed more positively)
  • Household income (average income per household; higher numbers are viewed more positively)
  • Recreation (places rated by the foremost recreation site; higher numbers indicate more recreation)
  • K–12 Schools (Top elementary and high schools rated by Great Schools from 1 “poor” to 10 “highest quality”)
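Because some of the variables above are better when higher and others when lower, they need a common direction and scale before cities can be ranked. The sketch below is one possible way to do that: min-max rescaling with the "lower is better" variables flipped, followed by an unweighted average. It is written in Python purely for illustration (the assignment itself calls for R). The column names, the equal weighting, and the three sample cities with their values are all made-up assumptions, not data from the data sheet. Median age is left out because the list gives it no preferred direction.

```python
# Direction of each variable as stated in the list above:
# +1 means higher is better, -1 means lower is better.
# The key names are hypothetical; rename them to match the data sheet.
DIRECTION = {
    "income": +1, "commute": -1, "job_growth": +1, "physicians": +1,
    "murder_rate": -1, "rape_rate": -1, "golf": -1, "restaurants": -1,
    "housing": +1, "literacy": +1, "household_income": +1,
    "recreation": +1, "schools": +1,
}


def min_max_scores(values, direction):
    """Rescale values to the range 0..1 so that 1 is always the best end."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every city has the same value, so the variable cannot separate them.
        return [0.5 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if direction > 0 else [1.0 - s for s in scaled]


def rank_cities(rows):
    """Return (city, score) pairs sorted best first.

    rows is a list of dicts with a 'city' key plus any DIRECTION keys.
    The score is the unweighted mean of the rescaled variables
    (equal weights are an assumption, not part of the assignment).
    """
    used = [k for k in DIRECTION if all(k in r for r in rows)]
    if not used:
        raise ValueError("no scorable variables found in rows")
    totals = {r["city"]: 0.0 for r in rows}
    for key in used:
        scores = min_max_scores([r[key] for r in rows], DIRECTION[key])
        for row, score in zip(rows, scores):
            totals[row["city"]] += score
    pairs = [(city, total / len(used)) for city, total in totals.items()]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Made-up placeholder values, only to show the mechanics.
sample = [
    {"city": "City A", "income": 40000, "commute": 50, "schools": 8},
    {"city": "City B", "income": 30000, "commute": 30, "schools": 6},
    {"city": "City C", "income": 35000, "commute": 40, "schools": 10},
]
ranking = rank_cities(sample)
```

With these placeholder values, City C ranks first: it is in the middle on income and commute but has the top school rating. A different weighting would change the order, so the weights should be justified in the report.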
Part 1: Identify the Key Variables
  1. Using the provided data sheet (see attached), select a metropolitan/county area to analyze and perform a data analysis.
  2. Identify the key variables to assess the “quality of life” by conducting the tasks below using Microsoft R:
    • Clean the data provided to you in the data sheet.
  3. Analyze the data through regression and determine the key variables used to assess the “quality of life” from the list provided in the data sheet.
    • Submit a summary of the key variables that should be used to assess the “quality of life”.

 

 
