Tuesday, January 31, 2017

What kind of jobs can English Majors Get?


What kind of jobs can I get if I major in English? (Lots) Do I have to major in science to go to medical school? (No) Do actors have to go to a Theater program? (No).

All these sound like conventional wisdom, but now, thanks to my friends at Human Capital Research Corporation, we have some better answers.  The data set they put together is based on The American Community Survey (ACS) of the Census Bureau, a small but statistically significant sample of the US Population.  It asks questions that include occupation and college major (for those who are working, and for those who have a bachelor's degree).  The data below contains over 3 million individual responses to these questions (for people in the labor force between the ages of 25 and 60 with a bachelor's degree).

On the first dashboard (using the tabs across the top), you see two views.  On the blue chart on the left, choose a major (cluster) at the top.  The chart below will show you the professions (also clustered) of people with a bachelor's degree in that area.  Hover over a square for details, including the number and the percentage of the total.  Multiply the number by about 20 to scale the sample up to an estimate of the total.

On the red chart, choose the profession, and see the majors of the people working in that area.

Most engineers majored in engineering; most nurses in nursing, most teachers in education, and most accountants in business.  But beyond that, you get a rich sense of the wide range of careers open to people with almost any degree.

On the second tab, look at the majors on the left, and see how people are distributed by going across the row. Look for larger, blue bubbles to see clusters: 37% of people with a degree in library science, for instance, work as librarians; 29% of architecture majors are architects.  The rows total 100%. Unfortunately, the number of professions makes labeling them impossible, except in the box that pops up when you hover.

Then, on the third tab, the view is the same, but the columns total 100%.  So you see the majors of people in professions.
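If the difference between the second and third tabs isn't obvious, here's a rough sketch in Python of the two calculations.  The responses are made up for illustration (they are not the ACS data), and the function names are my own, not anything from the dashboard.

```python
from collections import Counter

# Hypothetical (major, profession) responses, invented for illustration only.
responses = [
    ("English", "Teacher"),
    ("English", "Writer"),
    ("English", "Lawyer"),
    ("English", "Teacher"),
    ("Biology", "Physician"),
    ("Biology", "Teacher"),
    ("Biology", "Physician"),
    ("Biology", "Lab technician"),
]

counts = Counter(responses)


def row_percentages(counts):
    """Share of each major's graduates in each profession (each major totals 100%)."""
    totals = Counter()
    for (major, _profession), n in counts.items():
        totals[major] += n
    return {(m, p): 100 * n / totals[m] for (m, p), n in counts.items()}


def column_percentages(counts):
    """Share of each profession's workers from each major (each profession totals 100%)."""
    totals = Counter()
    for (_major, profession), n in counts.items():
        totals[profession] += n
    return {(m, p): 100 * n / totals[p] for (m, p), n in counts.items()}


by_major = row_percentages(counts)          # the second tab's view
by_profession = column_percentages(counts)  # the third tab's view

print(by_major[("English", "Teacher")])       # percent of English majors who teach
print(by_profession[("English", "Teacher")])  # percent of teachers who majored in English
```

The same cell gives two different answers depending on which total you divide by, which is why the two tabs tell different stories.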

On the last two views, the story is not the large bubbles, I think, although they add to understanding; the story is the small bubbles: People from all majors doing all jobs.

And a word of caution, of course: I defaulted the first two views to biology and medicine, and the tendency will be to conclude that you must be a science major to go to medical school.  In fact, this is likely driven by the fact that the vast majority of applicants to medical school major in the sciences.

What else do you see here? What surprised you?  Let me know in the comments below.

Wednesday, January 11, 2017

NY City Public Schools, and what they might tell us about the SAT

Recently, I received a message from Akil Bello who pointed out a data visualization he had seen.  It was originally posted to Reddit, but later was edited to eliminate the red-green barrier that people with color-blindness face.  The story was here, using a more suitable blue-red scheme.

There's nothing really wrong with visualizing test scores, of course.  I do it all the time.  But many of the comments on Reddit suggest that somehow the tests have real meaning, as a single variable devoid of any context.  I don't think that's a good way to analyze data.

So I went to the NY City Department of Education to see what I could find.  There is a lot of good stuff there, so I pulled some of it down and began taking a look at it.  Here's what I found.

On the first chart, I wanted to see if the SAT could be described as an outcome of other variables, so I put the average SAT score on the y-axis, and began with a simple measure: Eighth grade math and English scores on the x-axis. Hover over the regression line, and you'll see an r-squared of about .90.
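For readers who want to see what sits behind a regression line and its r-squared, here's a minimal sketch in Python of an ordinary least-squares fit with one predictor.  The school numbers are invented for illustration, not pulled from the NYC data, and I'm assuming a simple straight-line fit; the function name is my own.

```python
def linear_fit(xs, ys):
    """Least-squares line through (x, y) points.

    Returns (slope, intercept, r_squared). Assumes at least two points,
    and that neither the x values nor the y values are all identical.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # r-squared: the share of the variation in y that the line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot

    return slope, intercept, r_squared


# Hypothetical schools: mean 8th grade score (x) and mean SAT score (y).
eighth_grade = [2.1, 2.4, 2.8, 3.0, 3.3, 3.9]
sat = [760, 810, 880, 930, 980, 1150]

slope, intercept, r_squared = linear_fit(eighth_grade, sat)
print(f"SAT = {intercept:.0f} + {slope:.0f} * score, r-squared = {r_squared:.2f}")
```

An r-squared near 1 means the points hug the line; near 0 means the x-axis variable tells you almost nothing about the y-axis one.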

Scientists would use the term "winner, winner, chicken dinner" when getting results like this.  It means, for all intents and purposes, that if you know a high school's mean 8th grade achievement scores, you can predict their SAT scores four years later with amazing accuracy.  And--here's the interesting thing--the equation holds for virtually every single school.  There are few outliers.

Ponder that.

But critics of the SAT also say that the scores are reflective of other things, too; an accumulation of social capital, for instance.  So use the control at the bottom to change the value on the x-axis.  Try economic need index, or percentage of students in temporary housing, or percentage of the student body that are White or Asian. The line may go up (positive correlation) or down (negative) but you'll always see the schools with the highest scores tend to have the characteristics you'd expect.

Jump to the second tab.  This is more a response to the Reddit post: The top map shows the ZIP codes and a bubble, indicating the number of schools in that ZIP.  The bottom map shows every school arrayed on two poverty scales: Economic Index and Percent in Temporary Housing.  The color shows the mean SAT score in the school (Critical Reading plus Math, on a 1600-point scale.)  Purple dots represent higher scores.

Use the ZIP highlighter, and you'll see the top map show only that bubble, and the bottom will show the schools in it.

Got the lesson?  Good.  Now, think about why the colleges with high median test scores a) have them, b) tend to produce students with high GRE and MCAT and LSAT scores, and c) point to excellent outcomes for their students.

And let me know what you think.






Wednesday, January 4, 2017

The Outlook in Illinois

Much of what I post here is slightly modified from what I use at work, and this is no exception.  Here at DePaul (like most universities) the biggest single slice of enrollment comes from our own state, and it's important to know what's going to be happening to the student markets in the future.

So I downloaded data from The Illinois State Board of Education showing enrollments for two years, 2010-2011 and 2015-2016, to see how things have changed over time, and to get a glimpse of the future.  This is a more granular look than the WICHE data I visualized recently, but these are not actual projections going forward, just numbers; projections require a lot of time and mathematics skills, neither of which I have.  I would have liked to go deeper and farther with this, but the data are messy, and even things like School District IDs have changed over time.

There are four views using the tabs across the top: First by region, then county-by-county, and then a scattergram showing each county by both percent change and numeric change over time.  On each, make a choice at the top of the page to change the data displayed: You can look at total pre-K through 12 enrollment, if you like (the default view), or you can change to show grade-level enrollments, or by ethnicity or low-income status.
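The two measures on the scattergram are simple arithmetic, sketched here in Python.  The county labels and enrollment figures are placeholders I made up, not ISBE numbers, and the function name is my own.

```python
def enrollment_change(before, after):
    """Return (numeric_change, percent_change) between two enrollment counts.

    percent_change is None when the starting count is zero, since a
    percentage of zero is undefined.
    """
    numeric = after - before
    percent = None if before == 0 else 100.0 * numeric / before
    return numeric, percent


# Hypothetical counties: (2010-2011 enrollment, 2015-2016 enrollment).
counties = {
    "County A": (1000, 900),
    "County B": (250000, 245000),
}

for name, (before, after) in counties.items():
    numeric, percent = enrollment_change(before, after)
    print(f"{name}: {numeric:+d} students, {percent:+.1f}%")
```

A small county can post a large percent change on a small numeric change, and a large county the reverse, which is why it helps to plot both.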

Finally, the last tab shows individual schools.  You can type part of your school's name in the drop down box to start filtering, but be sure you find the county as well as the school.  If you're going to be looking up "Lincoln," you've got a lot of work to do!  Also, some schools have their name listed slightly differently in different years, and if your school is one of them, you won't get two years of data showing.

Please note: The data are not granular, so you can't combine variables (for instance, low-income students in 8th grade.)  And, I've excluded small numbers from the analysis (students in juvenile detention centers, or public school students being educated at other sites.)

But it's still interesting, I think, especially if you drill down a bit using the filter at the top.

What do you see? Leave a note in the comments.