Category Archives: Data

Randomized control trial using masks.

Economists and perhaps others will benefit from seeing the results of this large-scale randomized controlled trial on wearing masks in Bangladesh, which studied N=342,126 adults in three study arms: cluster-randomized villages and households with no intervention, with free cloth masks, and with free surgical masks. Participants also received information and local reminders.

 

Here is the Yale research paper by economist Jason Abaluck et al. describing the experiment.

The Impact of Community Masking on COVID-19: A Cluster Randomized Trial in Bangladesh. https://elischolar.library.yale.edu/cowles-discussion-paper-series/2642/

August, 2021

 

Here is a popular summary in The Atlantic from September 4, 2021.

https://www.theatlantic.com/ideas/archive/2021/09/masks-were-working-all-along/619989/

 

The study shows the superiority of surgical masks over cloth masks, and both sets of masks achieved a roughly ten percent reduction in symptomatic seroprevalence. This is less than perfect, but the study interventions were only able to increase the wearing of masks from 13% to 42% and raise social distancing from 24% to 29%. Recall that the study was done in Bangladesh.

Hourly weather forecasts for US

I rely upon my cell phone for hourly forecasts of rain and weather, but have not known how to get hourly forecasts more than 24 hours into the future. Yesterday a friend sent me the following National Weather Service link that forecasts hourly weather up to six days ahead.  Of course, it loses precision, but still, sometimes you need to make decisions far into the future. All the usual radar maps and other information are on nearby clicks.  All free and thanks to NOAA. Enjoy.

https://forecast.weather.gov/MapClick.php?lat=42.36&lon=-71.06&unit=0&lg=english&FcstType=graphical

Alas, it only does forecasts for the US.
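For readers outside the area in the link above, here is a minimal Python sketch that rebuilds the same link for other coordinates. It assumes that the `lat` and `lon` query parameters select the location, which is my reading of the URL and not something I have confirmed in NWS documentation; the other parameters are copied unchanged. The function name `hourly_forecast_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Base address and query parameters copied from the link above.
BASE_URL = "https://forecast.weather.gov/MapClick.php"


def hourly_forecast_url(lat: float, lon: float) -> str:
    """Build the graphical hourly-forecast link for a latitude/longitude pair.

    Assumption: lat/lon (decimal degrees) pick the forecast location.
    """
    params = {
        "lat": f"{lat:.2f}",
        "lon": f"{lon:.2f}",
        "unit": "0",              # copied unchanged from the original link
        "lg": "english",          # copied unchanged from the original link
        "FcstType": "graphical",  # copied unchanged from the original link
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Rebuild the link from this post using its own coordinates.
example_url = hourly_forecast_url(42.36, -71.06)
print(example_url)
```

Swap in your own latitude and longitude (US locations only) to try another place.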

The unsurprising tragedies of the Afghanistan war

As we ponder the tragedies of the US withdrawal from Afghanistan, it is important to also remember the costs of our continuing. An excerpt from the Boston Globe is pasted in below.

The bottom line is that the wars in Afghanistan and Iraq will have cost the US at least $4 trillion (excluding interest costs), which is 4 million million dollars. Given that the US has only 132 million households, this spending averages to over $30,000 per household, all paid for by debt we will eventually have to pay off (unlike previous wars, taxes were not increased).
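The per-household figure is simple division, and a short Python sketch makes it easy to check. Both inputs are the numbers quoted in this post; the variable names are only for illustration.

```python
# Inputs as quoted in the post.
WAR_COST_USD = 4_000_000_000_000   # "at least $4 trillion", excluding interest
US_HOUSEHOLDS = 132_000_000        # 132 million households

cost_per_household = WAR_COST_USD / US_HOUSEHOLDS
print(f"${cost_per_household:,.0f} per household")
```

This gives a little over $30,000 per household, matching the figure above.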

$4 trillion could have instead been invested in free public university tuition, free health care for all children, reducing climate change, or the $4 trillion infrastructure bill that President Biden is asking for.

I commend President Biden for actually doing what Presidents Bush, Obama and Trump all said they wanted to do but did not.

Joy and I have been listening to the audiobook “The Father of All Things: A Marine, His Son and the Legacy of Vietnam” which covers the fall of Saigon in Vietnam and subsequent events. There should be no surprise that the events in Kabul, even the speedy fall of the government, are the consequence of war.

The Father of All Things: A Marine, His Son – Amazon.com

https://www.amazon.com › Father-All-Things-Vietnam-…

 

Below is from The Boston Globe on Monday 8/16/2021.

Costs of the Afghanistan war, in lives and dollars

By ELLEN KNICKMEYER The Associated Press, Updated August 16, 2021, 5:00 p.m.

 

https://www.bostonglobe.com/2021/08/16/nation/costs-afghanistan-war-lives-dollars/

_________________________

The longest war:

Percentage of US population born since the 2001 attacks plotted by Al Qaeda leaders who were sheltering in Afghanistan: Roughly one out of every four.

The human cost:

American service members killed in Afghanistan through April: 2,448.

US contractors: 3,846.

Afghan national military and police: 66,000.

Other allied service members, including from other NATO member states: 1,144.

Afghan civilians: 47,245.

Taliban and other opposition fighters: 51,191.

Aid workers: 444.

Journalists: 72.

Afghanistan after nearly 20 years of US occupation:

Percentage drop in infant mortality rate since US, Afghan, and other allied forces overthrew the Taliban government, which had sought to restrict women and girls to the home: About 50. (RE note: Statista still lists the IMR at 5% (“about 46.5 per 1000”) of all live births in 2019. This is still an abysmal rate: 1 in 20 infants are dying.)

Percentage of Afghan teenage girls able to read today: 37%. (RE note: World Bank data show it as roughly doubling since 2011. Still appalling.)

Oversight by Congress:

Date Congress authorized US forces to go after culprits in Sept. 11, 2001, attacks: Sept. 18, 2001.

Number of times US lawmakers have voted to declare war in Afghanistan: 0.

Number of times lawmakers on Senate Appropriations defense subcommittee addressed costs of Vietnam War, during that conflict: 42.

Number of times lawmakers in same subcommittee have mentioned costs of Afghanistan and Iraq wars, through mid-summer 2021: 5.

Number of times lawmakers on Senate Finance Committee have mentioned costs of Afghanistan and Iraq wars since Sept. 11, 2001, through mid-summer 2021: 1.

Paying for a war on credit, not in cash:

Amount that President Truman temporarily raised top tax rates to pay for Korean War: 92 percent.

Amount that President Johnson temporarily raised top tax rates to pay for Vietnam War: 77 percent.

Amount that President George W. Bush cut tax rates for the wealthiest, rather than raise them, at outset of Afghanistan and Iraq wars: At least 8 percent.

Estimated amount of direct Afghanistan and Iraq war costs that the United States has debt-financed as of 2020: $2 trillion.

Estimated interest costs by 2050: Up to $6.5 trillion.

The wars end. The costs don’t:

Amount Bilmes estimates the United States has committed to pay in health care, disability, burial and other costs for roughly 4 million Afghanistan and Iraq veterans: more than $2 trillion.

Period those costs will peak: after 2048.

Source of the above: Much of the data is from Linda Bilmes of Harvard University’s Kennedy School and from the Brown University Costs of War project. Because the United States between 2003 and 2011 fought the Afghanistan and Iraq wars simultaneously, and many American troops served tours in both wars, some figures as noted cover both post-9/11 US wars.

 

Yes, even rich white people in the US get bad health care

Despite the abundant evidence showing that health care outcomes in the US are much worse than in every other OECD country, I still hear arguments that this is because uninsured people, Medicaid enrollees, minorities, or low-income people in the US bring down our health outcomes. This myth is repeated, and believed by a majority of Americans. This JAMA study shows that this is not true. Even high-income white people get worse health outcomes than the average result in OECD countries. Time to change to a better health care system!

 

Key Points

Question  Are the health outcomes of White US citizens living in the 1% and 5% richest counties better than the health outcomes of average residents in other developed countries?

Findings  In this comparative effectiveness study of 6 health outcomes, White US citizens in the 1% and 5% highest-income counties obtained better health outcomes than average US citizens but had worse outcomes for infant and maternal mortality, colon cancer, childhood acute lymphocytic leukemia, and acute myocardial infarction compared with average citizens of other developed countries.

Meaning  For 6 health outcomes, the health outcomes of White US citizens living in the 1% and 5% richest counties are better than those of average US citizens but are not consistently better than those of average residents in many other developed countries, suggesting that in the US, even if everyone achieved the health outcomes of White US citizens living in the 1% and 5% richest counties, health indicators would still lag behind those in many other countries.

JAMA Intern Med. 2021;181(3):339-344. doi:10.1001/jamainternmed.2020.7484

Comparing Health Outcomes of Privileged US Citizens With Those of Average Residents of Other Developed Countries

Ezekiel J. Emanuel, MD, PhD; Emily Gudbranson, BA; Jessica Van Parys, PhD; et al.

December 28, 2020

BUHealth: UK/South African COVID strains are at BU; BU testing looks great; BU plans in-person commencement!

I greatly enjoyed reading about how BU is using its extensive research laboratory resources to test for the presence of the UK and South African variants at BU. This report includes the 70 cases of COVID-19 detected in members of the BU faculty, staff and students during the week of Feb 17-23. Below are a few selected quotes.

Boston University Weekly COVID-19 Report: February 17 to 23

BU has begun sequencing COVID samples for variants; two variants that first emerged in South Africa, UK already detected at BU

Of the positive tests sent to the NEIDL for sequencing since January 25, more than 130 samples have contained enough viral material to allow them to be sequenced.

“… thus far, we have detected eight samples containing a COVID variant of concern. Specifically, we have detected two variants of concern: one case of the B.1.351, first detected in South Africa, and seven cases of the B.1.1.7, first detected in the UK. We were not surprised by these results – they confirm what we already suspected, that those two variants have reached our community.”

 

It was informative to me to learn that BU is not able or allowed to tell people which variant they have if infected.

“For regulatory reasons, BU is not permitted to tell individuals if they have a variant form of COVID-19. The scientists who are doing this study are not even aware of which person the samples they are sequencing came from; they just know the virus sample was collected from someone at BU.”

“… even if we could tell individuals that they had been infected with a COVID-19 variant, that knowledge wouldn’t change our clinical management of that person’s illness.”

I am fortunate to be part of BU’s comprehensive testing. Results are posted daily on the COVID-19 dashboard, current as of two days ago.

I only wish that more people had such excellent testing available. I have not seen any recent estimates of the cost to BU of doing these COVID tests, but an early guess was $12 per test. I think a lot of people would be willing to pay $12 (weekly) or $25 (biweekly) for careful testing, which is the cost per faculty member or undergraduate of BU’s testing program. BU is continuing its hybrid teaching, with students in many classes allowed to choose between in-class and remote zooming.

Based on these low current testing results and on vaccination efforts at BU, BU announced this week that it will be holding in-person graduation ceremonies on May 16 (graduates only) as long as the city and state allow it. Link is here and below. Go BU!

In-Person Commencement for BU Class of 2021 Planned for May 16, unless City Requires Virtual Ceremony

Class of 2020 will gather October 2 for virus-delayed ceremony: both will be for graduates only

 

Congratulations to BU’s Class of 2018 Economics graduates!

I have been on sabbatical, and hence am late to do this calculation and blog. Alas my sabbatical has come to an end.

Please celebrate the BU students who earned 590 degrees in Economics at Commencement this May!

This year the program honors:

18 Ph.D. recipients

231 Master’s degree recipients (MA, MAPE, MAEP, MAGDE, MA/MBA, BA/MA)

341 BA recipients (including BA/MA)

This total of 590 degrees is up 6% from 556 in 2017, following a 12% increase in 2017.

These numbers may undercount the total for the year since they may exclude some students who graduated in January.

In 2018 there were 18 PhDs, 231 Master’s degree recipients, and 341 BA recipients.

In 2017 there were 14 PhDs, 215 Master’s degree recipients, and 327 BA recipients.

In 2016 there were 22 PhDs, 203 Master’s degree recipients, and 273 BA recipients.

In 2015 there were 22 PhDs, 155 Master’s degree recipients, and 305 BA recipients.

In 2014 there were 17 PhDs, 207 Master’s degree recipients, and 256 BA recipients.
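For anyone who wants to check the totals and growth rates, here is a small Python sketch using only the counts listed above. The dictionary layout and names are mine, chosen for illustration.

```python
# (PhD, Master's, BA) degree counts by year, as listed in this post.
degrees = {
    2018: (18, 231, 341),
    2017: (14, 215, 327),
    2016: (22, 203, 273),
    2015: (22, 155, 305),
    2014: (17, 207, 256),
}

# Total degrees awarded each year.
totals = {year: sum(counts) for year, counts in degrees.items()}

# Year-over-year percentage change, for years where the prior year is listed.
growth = {
    year: 100 * (totals[year] - totals[year - 1]) / totals[year - 1]
    for year in totals
    if year - 1 in totals
}

for year in sorted(totals, reverse=True):
    change = f"{growth[year]:+.1f}%" if year in growth else "n/a"
    print(year, totals[year], change)
```

The totals for 2018 and 2017 come out to 590 and 556, with growth of about 6% and 12% respectively.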

Altogether 19 Ph.D. students agreed to be shown as obtaining jobs on our placements web site this year (versus 13 in 2017 and 23 in 2016). Not all graduates choose to be listed.

To see the Ph.D. placements visit the web site linked here.

http://www.bu.edu/econ/gradprgms/phd/placements/

The department’s  website now lists 38 regular faculty (down one from last year) with titles of assistant, associate or full professors, a number which is 3 below the number of professors in 2014 (our peak year) as listed on the commencement programs. Here are the recent counts of faculty.

2018: 38 tenured or tenure-track faculty, of which 4 are women (10%); 12 non-TT faculty, of which 3 are women (25%); 50 total faculty, of which 7 are women (14%). Of the TT faculty, there are 11 assistant (2 women), 7 associates (1 woman), and 23 full professors (1 woman)

2017: 39 tenured or tenure-track faculty, of which 3 are women (8%); 12 non-TT faculty, of which 3 are women (25%); 51 total faculty, of which 6 are women (12%)

2016: 38 tenured or tenure-track faculty, of which 5 are women (8%); 7 non-TT faculty, of which 1 is a woman (14%); 47 total faculty, of which 6 are women (12%)

2015: 40 tenured or tenure-track faculty, of which 5 are women (12%); 7 non-TT faculty, of which 2 are women (29%); 47 total faculty, of which 7 are women (15%)

2014: 41 tenured or tenure-track faculty, of which 6 are women (15%); 4 non-TT faculty, of which 1 is a woman (25%); 45 total faculty, of which 7 are women (16%)

http://www.bu.edu/econ/faculty-staff/faculty-profiles/

Congratulations to all!

US health spending and global burden of disease

I want to thank Veronica Vargas for sending me the following link from the Institute for Health Metrics and Evaluation (IHME), which features innovative ways of displaying different cuts of US and international data from massive data files. It will perhaps take you fifteen minutes or more of viewing to get a feel for the site. It is staffed by the University of Washington, but appears to be funded largely by the Gates Foundation. It has been around for a while, but they are making a big push on its features this fall.

The first link decomposes spending in the US by disease and by broad type of service (pharmacy, IP, OP, Dentist, ER).

They document the well-known result that about half of the US increase is due to price increases, not intensity or illness, although aging and population growth contribute. US costs are higher than the rest of the world largely because our prices paid for all types of care are much higher than elsewhere. And increasingly so.

 

Here is a direct link to the interesting interactive figures. Try the four different tabs across the top if you are curious. (It is a little slow on my wireless laptop.)

https://vizhub.healthdata.org/dex/

It allows you to drill down to questions such as how much was spent on an individual disease for certain ages, or on emergency department care.

If you click on “visualizations” in the upper right, you get different views that can be plotted, which are very extensive.

Or start here http://www.healthdata.org/results/data-visualizations

 

Below is a link to the article originally posted, along with a sample figure.

 

Factors associated with increases in US health care spending, 1996–2013

Here is one that lets you choose one country, or compare two or more countries’ disease burdens, along multiple dimensions.

https://vizhub.healthdata.org/gbd-compare/

They say that their mission includes sharing data for researchers. Here is a link to various data that they support and document with a nice search tool.

http://ghdx.healthdata.org/data-by-type

 

Global Burden of Disease module lets you answer questions as specific as how many people die of air pollution in India in 2013.

Here is how they describe it.

September 14, 2017

GBD Compare

Data Visualization

Analyze updated data about the world’s health levels and trends from 1990 to 2016 in this interactive tool. Use treemaps, maps, arrow diagrams, and other charts to compare causes and risks within a country, compare countries with regions or the world, and explore patterns and trends by country, age, and gender. Drill from a global view into specific details. Compare expected and observed trends. Watch how disease patterns have changed over time. See which causes of death and disability are having more impact and which are waning.

This is not a site oriented toward hypothesis testing, although it does include confidence intervals on many estimates (which seem to reflect only sampling precision, not other sources of uncertainty such as the quality of the underlying data). For me, the main use will be in writing the introduction of a paper, to summarize how large a problem is, how many people have a given condition, how it is growing, etc. The international breadth is stunning. At a different level, it is a good example of how big data can be manipulated using “cubes” and different cuts of the data to show fascinating patterns. (Girls less than 1 year old cost $11,000 each on average, which drops to $1,600 at ages 5-9, and it is not until age 65 in the US that female mean cost is again over $11,000. It peaks at $31,000 per year over age 85.)

 

Be forewarned: you can spend a lot of time playing around…

 

 

Excellent articles about machine learning and replication

There is a wonderful article about machine learning in the spring 2017 issue of the Journal of Economic Perspectives, and there is also a series of four fine articles in the AER May 2017. I decided to share this as a BUHealth blog post for all.

Whether you are curious, newly interested or an expert working in the area, I recommend the JEP one to you. The AER series is for more serious work. Here are the links (They should all be free to access, since they are all at the AEA.) Also see below for links on replication.

Machine Learning: An Applied Econometric Approach

 

Machine Learning in Econometrics (May, 2017)

Double/Debiased/Neyman Machine Learning of Treatment Effects

Victor Chernozhukov, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen and Whitney Newey

(pp. 261-65)

Testing-Based Forward Model Selection

Damian Kozbur

(pp. 266-69)

Core Determining Class and Inequality Selection

Ye Luo and Hai Wang

(pp. 274-77)

Estimating Average Treatment Effects: Supplementary Analyses and Remaining Challenges

Susan Athey, Guido Imbens, Thai Pham and Stefan Wager

(pp. 278-81)

 

The series in the AER on Replication in microeconomics will also be of interest.  This article title speaks for itself.

A Preanalysis Plan to Replicate Sixty Economics Research Papers That Worked Half of the Time

Replication in Microeconomics

Assessing the Rate of Replication in Economics

James Berry, Lucas C. Coffman, Douglas Hanley, Rania Gihleb and Alistair J. Wilson

(pp. 27-31)

Replications in Development Economics

Sandip Sukhtankar

(pp. 32-36)

Replication in Labor Economics: Evidence from Data, and What It Suggests

Daniel S. Hamermesh

(pp. 37-40)

A Proposal to Organize and Promote Replications

Lucas C. Coffman, Muriel Niederle and Alistair J. Wilson

(pp. 41-45)

Replication and Ethics in Economics: Thirty Years after Dewald, Thursby, and Anderson

What Is Meant by “Replication” and Why Does It Encounter Resistance in Economics?

Maren Duvendack, Richard Palmer-Jones and W. Robert Reed

(pp. 46-51)

Replication and Economics Journal Policies

Jan H. Höffler

(pp. 52-55)

Replication, Meta-analysis, and Research Synthesis in Economics

Richard G. Anderson and Areerat Kichkha

(pp. 56-59)

A Preanalysis Plan to Replicate Sixty Economics Research Papers That Worked Half of the Time

Andrew C. Chang and Phillip Li

(pp. 60-64)

 

 

$147 Billion: The Economic Cost of Trump Racism

Bottom line:

Trump’s racism predicted to cost US households $147 billion in extra payments to the rest of the world.

Like many people, I am appalled by President Trump’s recent executive order banning refugees – and even US legal immigrants – from seven predominantly Muslim countries from entering the country. In the process, Trump has also angered most of the world’s 1.6 billion Muslims, comprising about 23% of the global population. Many of these countries (e.g., OPEC members) hold major financial assets in the US. President Trump has also been active in insulting Mexico and China, two other huge financial partners with the US.

Since Trump is a businessman, I am going to focus on why this racism is a really bad idea for the US economy.

According to the US Treasury, US national debt held by the public as of last Friday, Jan 26, 2017 was 14.4 trillion dollars. That is over $42,000 borrowed on your behalf per US resident. Last year (2016), the interest paid on that debt was $432 billion, or about $3,456 per American household per year. (US census numbers estimate 125 million households currently.) Of that total debt, about 34% is owned by international investors, and we are paying them the interest on their holdings. So that is $1,175 per US household being paid out to foreigners last year before Trump became president.
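The per-household numbers in the paragraph above can be reproduced with a few lines of Python. This is only a sketch of my arithmetic: the inputs are the figures quoted in this post, and it assumes that foreign investors receive interest in proportion to their 34% share of the debt.

```python
# Inputs as quoted in the post.
INTEREST_PAID_2016 = 432e9   # dollars of interest paid on the debt in 2016
US_HOUSEHOLDS = 125e6        # estimated number of US households
FOREIGN_SHARE = 0.34         # share of the debt owned by international investors

# Interest cost spread evenly over households.
interest_per_household = INTEREST_PAID_2016 / US_HOUSEHOLDS

# Assumption: interest is paid out in proportion to debt holdings.
foreign_interest_per_household = FOREIGN_SHARE * interest_per_household

print(f"Interest per household: ${interest_per_household:,.0f}")
print(f"Paid to foreign holders: ${foreign_interest_per_household:,.0f}")
```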

Although my good colleague Larry Kotlikoff worries that this level of debt, particularly to foreigners, is not sustainable, that is not what I want to focus on here. I want to focus instead on the CHANGE in these debt payments to foreigners that can be attributed to Trump’s scary racism. According to the current US Treasury documents, the long-term interest rate has increased by more than a half percentage point since Trump was elected. Look at the figure yourself. The increase when Trump was elected was immediate. Even faster than the stock market advance.

https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/Historic-LongTerm-Rate-Data-Visualization.aspx

Increasing the interest rate on the national debt from 2 percent to 2.5 percent costs Americans an extra $294 per household EVERY YEAR until our debt is paid off, which will probably be never.

Now imagine that OPEC, or the Chinese, or the Mexicans decide that they are not so happy with the US anymore and decide to start dumping their $6.8 trillion of US debt. We should all expect that Trump’s racist policies will cause federal interest rates to increase to 4% per year, another 2 percentage points above the rate before Trump. This doubling of US debt interest rates will result in an additional $1,175 per US household per year being paid out to foreigners, or another $147 billion in total interest paid out to the rest of the world for our debts.
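To make the scenario arithmetic easy to check, here is a rough Python sketch. It assumes, as the calculation above implicitly does, that interest payments to foreign holders scale in proportion to the interest rate, starting from a 2% base rate and $1,175 per household. The function name is made up for illustration.

```python
# Figures taken from the post.
FOREIGN_INTEREST_PER_HOUSEHOLD = 1_175   # dollars per household per year
US_HOUSEHOLDS = 125_000_000


def extra_cost_per_household(base_rate: float, new_rate: float,
                             base_payment: float = FOREIGN_INTEREST_PER_HOUSEHOLD) -> float:
    """Extra yearly payment if payments scale proportionally with the rate.

    Assumption: the payment at new_rate is base_payment * new_rate / base_rate.
    """
    return base_payment * (new_rate / base_rate - 1)


half_point_rise = extra_cost_per_household(2.0, 2.5)   # rate goes from 2% to 2.5%
doubling = extra_cost_per_household(2.0, 4.0)          # rate goes from 2% to 4%
total_extra = doubling * US_HOUSEHOLDS                 # summed over all households

print(f"2% -> 2.5%: ${half_point_rise:,.2f} extra per household per year")
print(f"2% -> 4%:   ${doubling:,.2f} extra per household per year")
print(f"Total at 4%: ${total_extra / 1e9:,.1f} billion per year")
```

Under these assumptions the half-point rise costs about $294 per household and the doubling about $147 billion in total, which are the figures used in this post.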

That is my prediction of what will happen if Trump does not change his racist policies.

Notice that I did not have to do any calculations based on the effects of Trump’s racism on foreign tourism in the US ($168 billion in 2012), spending on US universities by foreigners ($30 billion in 2015), or US exports to China ($113 billion in 2015) or Mexico ($267 billion in 2015). Trump’s tax cuts and deficit spending policies are also likely to increase interest rates. It would be easy for much larger estimates to be generated.

Take home lessons:

  • Racist policies are bad ethically and bad for our economy.
  • There is a real danger of serious interest rate increases that will cost everyone a lot of money.
  • Bond prices are likely to fall and long term bonds seem like a poor investment choice.
  • It helps with your arguments if you use facts and citations instead of making things up.
  • So far, instead of having Mexico pay for Trump’s Stupid Wall, Trump’s racist policies are making Americans pay more to Mexico and China and Iran and Saudi Arabia, and….

[Figure: US Treasury long-term (ten-year) interest rates, as of 2017-01-29]

 

 

Read this posting on Stupid Economics

I invite you to read this Forbes posting on Stupid Economics by Laurence Kotlikoff.

 

I don’t always agree with my dear colleague, Larry Kotlikoff, but this posting at Forbes is one that I can really get behind.

Our president needs to start listening to serious economists instead of acting solo as an autocrat.

 

This article is a two minute read that will make you smile regardless of whether you agree with all of the sentiments.

 

Larry is a serious scholar, of course, and his credentials include not only nineteen books, but also a stint on the Council of Economic Advisers under President Reagan.

 

From: owner-faculty-econ-list@bu.edu [mailto:owner-faculty-econ-list@bu.edu] On Behalf Of Laurence Kotlikoff
Sent: Thursday, January 26, 2017 1:10 PM
To: faculty-econ-list; phd-econstudents-list
Subject: This may be of interest.

https://www.forbes.com/sites/kotlikoff/2017/01/26/stupid-economics/#61453f38e94f

 

 

Let me know if I’m being unfair. But I think it’s time to call this for what it is.

 

best, Larry

 

BU Grads Ranked among the World’s Most Employable

One more ranking in which BU rates very highly in the world.

BU Grads Ranked among the World’s Most Employable
11th worldwide, 7th in the nation in international survey

The employability of BU graduates was recently ranked 11th in the world and 7th in the nation in a report published in Times Higher Education. The Global University Employability Ranking 2016 was designed by French human resources company Emerging, which sent an online survey asking the opinions of thousands of recruiters at a management level and of managing directors of international companies.

The California Institute of Technology was ranked number one on the list, followed by the Massachusetts Institute of Technology, Harvard University, the University of Cambridge, and Stanford University.

“It’s very heartening that so many employers recognize that our graduates are very well-prepared in their fields and have the skills and habits to perform at a high level,” says President Robert A. Brown. “Helping to successfully launch the careers of our graduates is a focus of the University.”

Excerpt is from Bostonia Magazine.

Obama’s JAMA article is a must read for all professionals

There is a very important article in this week’s JAMA – Internal Medicine, written by Barack Obama.

It highlights the effects of the ACA/Obamacare.  It is free on-line.

United States Health Care Reform: Progress to Date and Next Steps

http://jama.jamanetwork.com/article.aspx?articleid=2533698

If you are short on time, then the following link to just the figures provides many of the key results.

http://jama.jamanetwork.com/article.aspx?articleid=2533698

To me the highlights of the article are that it documents:

  • The decline in the uninsured (no surprise, but well presented), now down to 9.1 percent from over 16 percent
  • Declines in teen smoking from 19.5% to 10.8% due to the Tobacco Control Act of 2009 (Wow)
  • Much slower rates of decline in the uninsured in states that refused the Medicaid expansion (no surprise)
  • The decline in the underinsured among privately insured as measured by the near disappearance of unlimited exposure (new to me)
  • Lower rates of individual debt sent to a collection agency (great to see)
  • Negative rates of real cost growth in Medicare and Medicaid since 2010, with drastically lower growth among the privately insured
  • Constant share of out-of-pocket spending as a fraction of total spending among the employer-based insurance (new to me; he cites increases in deductibles offset by decreases in copays and coinsurance)
  • Forecast Medicare spending in 2019 is now 20% LOWER than when he took office
  • Decline in Medicare 30-day, all hospital readmission rates as well as improvements in other measures

This information is important to understand in order to counter the repeated claims that Obamacare is a failure, or has increased health care spending, or is bankrupting the government, all of which are shown to be false by the evidence presented here.

Here is the link again.

http://jama.jamanetwork.com/article.aspx?articleid=2533698

Top 100 Economics Blogs of 2016

I just got an email from Prateek Agarwal <prateek@intelligenteconomist.com>

He has compiled a list of the Top 100 Economics Blogs of 2016. I am of course not on it since I blog infrequently and do not archive (and make public) on my web site all of my blogs, but I thought I would share the link he provided.

https://www.intelligenteconomist.com/top-economics-blogs-2016/

Lots of interesting links, including The Incidental Economist, which is the only one I subscribe to. Be warned that reading blogs can be a major time waster.

I plan to archive this one on my web site blog.

Hope to see many of you at ASHEcon. (Not too late to sign up for the dinner)

Best.

You should get a large SSD hard drive

This email will interest anyone who is processing very large data files, such as 5 GB or more, or who is frustrated with how long Windows updates and other IO-intensive tasks take. (Most Apple users are probably already using SSD drives.)

In October the hard drive on my Windows Desktop failed (not entirely: it just became erratic), so I had to buy a replacement.

Old hard drive: 1 Terabyte, spinning drive 7200 RPM, 2011 vintage
New hard drive: 1 Terabyte SSD solid state drive, 2015 vintage. Only $389 at Microcenter (some are cheaper now).

The BU IT department was able to clone my original hard drive so that I did not have to reinstall any of the software. It ran from the time I turned it on, except that it was much much faster. How much faster? Five to ten times faster on IO bound tasks. These graphs show the difference.

[Figure: old vs. new hard drive]

[Figure: various N]

Times using mostly 0-1 binary regressors are much faster than continuous variables, since they can be compacted so nicely.

This graph is just illustrating that SAS can handle very big matrices well, although the sample sizes were fixed at 10k.

[Figure: various k]

If you are doing a lot of computationally intensive work on moderate size data, then faster CPU, multiple processors, and more RAM are critical. If you are processing Big Data, where the data is larger than your memory, then fast hard drives are the key. SSD drives are 5-10 times faster for most IO tasks.

You can also upgrade your laptop to an SSD drive with lots of capacity. New laptops with SSD and adequate memory will be faster on big data than your desktop with a conventional hard drive. I am planning to get one to upgrade my old laptop. I will try to do a better job benchmarking before and after with that upgrade.
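If you want to run your own rough before-and-after comparison, here is a minimal Python sketch of a sequential write and read timing. It is not the SAS benchmark behind the graphs in this post, just a generic illustration; the function name, file size, and chunk size are arbitrary choices of mine. Note that the read timing can look unrealistically fast because the operating system may serve the file from its memory cache.

```python
import os
import tempfile
import time
from typing import Optional


def time_sequential_io(size_mb: int = 64, chunk_mb: int = 4,
                       directory: Optional[str] = None) -> dict:
    """Write then read a temporary file and report rough throughput in MB/s.

    Pass `directory` to choose which drive is tested (hypothetical usage:
    a folder on the old drive, then a folder on the new SSD).
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)   # random bytes to write
    n_chunks = size_mb // chunk_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Timed sequential write, forced to disk with fsync.
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            os.fsync(f.fileno())
        write_s = time.perf_counter() - start

        # Timed sequential read (may be served from the OS cache).
        start = time.perf_counter()
        bytes_read = 0
        with open(path, "rb") as f:
            while True:
                block = f.read(len(chunk))
                if not block:
                    break
                bytes_read += len(block)
        read_s = time.perf_counter() - start
    finally:
        os.remove(path)   # clean up the temporary file

    megabytes = bytes_read / (1024 * 1024)
    return {
        "bytes": bytes_read,
        "write_mb_per_s": megabytes / max(write_s, 1e-9),
        "read_mb_per_s": megabytes / max(read_s, 1e-9),
    }


if __name__ == "__main__":
    print(time_sequential_io())
```

Run it once on each drive with the same settings and compare the write numbers; for data larger than memory, use a file size well above your RAM so that caching matters less.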

For sale at MicroCenter.com (Cambridge) near BU.

I purchased:
Samsung 850 EVO Series 1TB SATA III 6Gb/s mSATA Internal Solid State Drive Single Unit Version MZ-M5E1T0BW
Now $399.99, and was as low as $319. I predict prices to go down for Black Friday next week, and that prices were increased to ready for that “sale”.
“The MZ-M5E1T0BW from Samsung utilizes innovative 3D V-NAND Technology for incredible Read/Write Performance with enhanced endurance and reliability, giving you the most evolved SSD for Ultra-thin Laptops and PCs”

One review, probably by an employee…:
“All I can say is if you have an available mSATA slot open- just do it! That old spinning HD is killing your battery life! They only last a couple of years before they crash!! This is a solid state disk drive – No moving parts to wear out. Especially if you’re accident prone like me and drop it. SSD’s do not have head crashes like spinning hard drives.
The biggest bang for the buck is the performance! the read and write speeds are instantaneous! No waiting at all. Those Microsoft updates that take hours now take minutes. The mSATA drive is very easy to install. The bay is usually under the keyboard (Two screws to remove – Google it for instructions)- just get disk cloning software and follow the instructions. Remove the old spinning piece of rust and you’re off to the races! You can even get an external case to put your old hard drive in and use it for a backup. This little upgrade may breath new life into that old laptop – saving you from having to buy a newer one for a couple of more years…. I’m sure the laptop manufacturers don’t want to hear that!”

You will also need a mounting bracket to hold it in place in a normal 3.5″ slot. The drive itself is listed at
http://www.microcenter.com/product/445921/850_EVO_Series_1TB_SATA_III_6Gb-s_mSATA_Internal_Solid_State_Drive_Single_Unit_Version_MZ-M5E1T0BW
and the bracket I used is the
Kingwin Internal Dual 2.5″ HDD/SSD to 3.5″ Plastic Mounting Kit
http://www.microcenter.com/product/396605/Internal_Dual_25_HDD-SSD_to_35_Plastic_Mounting_Kit?rf=Add-Ons%3EDrive+Rails%3E

Cheaper now is:
Crucial BX100 1TB SATA III 6Gb/s 2.5″ Solid State Drive CT1000BX100SSD1
$309.99 in-store only
http://www.microcenter.com/product/443434/BX100_1TB_SATA_III_6Gb-s_25_Solid_State_Drive_CT1000BX100SSD1

“Outlast and outperform your hard drive. Boot up almost instantly. Load programs in seconds. And accelerate demanding applications with ease. It all starts with ditching your hard drive. Engineered to outperform a hard drive and deliver cost-effective performance, the Crucial BX100 leverages advanced flash memory technology and moves your computer beyond the outdated storage limitations of spinning discs. By transmitting data in a digital manner rather than having to seek it out on a spinning platter, the Crucial BX100 is over 15x faster, 2x more reliable, and 2x more energy efficient than a typical hard drive.”

You will also need an adapter kit to make it fit in the larger 3.5″ hard drive bays in most desktop PCs, such as

Vantec Dual 2.5″ to 3.5″ Hard Drive Mounting Kit $6.49
http://www.microcenter.com/product/398011/Dual_25_to_35_Hard_Drive_Mounting_Kit?rf=Add-Ons%3EDrive+Rails%3E

Talk to the staff about the computer you are putting the new hard drive into to get the right adapter kit.

Get help with installing it if you are not experienced. BU IT took less than a day (three hours) to install mine once it was purchased.

 

Ellis SAS tips for experienced SAS users

If you are a beginning SAS programmer, then the following may not be particularly helpful, but the books suggested in the middle may be. BU students can obtain a free license for SAS to install on their own computer if it is required for a course or research project. Either will require an email from an adviser. SAS is also available on various computers in the economics department computer labs.

I also created a posting, Ellis SAS tips for new SAS programmers.

I do a lot of SAS programming on large datasets, and thought it would be productive to share some of my programming tips on SAS in one place. Large data is defined to be a dataset so large that it cannot be stored in the available memory. (My largest data file to date is 1.7 terabytes.)

Suggestions and corrections welcome!

Use SAS macro language whenever possible;

It is so much easier to work with short strings than long lists, especially with repeated models and datasteps;

%let rhs = Age Sex HCC001-HCC394;
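As a minimal sketch of why the short string helps, the same macro variable can be reused across repeated models and other procs. The dataset CLAIMS and the outcome variables COST and VISITS are hypothetical names used only for illustration.

```sas
* Minimal sketch. CLAIMS, COST and VISITS are hypothetical names;
%let rhs = Age Sex HCC001-HCC394;

proc reg data=claims;
   cost:   model cost   = &rhs;   * first model uses the full list;
   visits: model visits = &rhs;   * second model reuses the same list;
run;
quit;

proc means data=claims;
   var &rhs;                      * same list again, no retyping;
run;
```

Changing the right-hand side for every model then means editing one %let line.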

 

Design your programs for efficient reading and writing of files, and minimize temporary datasets.

SAS programs on large data are generally constrained by IO (input output, reading from and writing to your hard drives), not by CPU (actual calculations) or memory (storage that disappears once your SAS program ends). I have found that some computers with high speed CPUs and multiple cores are slower than simpler computers because they are not optimized for speedy hard drives. Large memory really helps, but for really huge files it can almost always be exceeded, and then your hard drive speeds will really matter. Even when simply reading in and writing out files, hard drive speed will be your limiting factor.

The implication of this is that you should do variable creation in as few DATA steps as possible, and minimize sorts, since reading and saving datasets will take a lot of time. This requires a real change in thinking from STATA, which is designed for changing one variable at a time on a rectangular file. Recall that STATA can do this efficiently since it usually starts by bringing the full dataset into memory before doing any changes. SAS does not do this, which is one of its strengths.

Learning to use DATA steps and PROC SQL is the central advantage of an experienced SAS programmer. Invest, and you will save time waiting for your programs to run.
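A minimal sketch of the "few DATA steps" idea: all new variables are created in a single pass over the big file, rather than one step per variable. The library paths, the dataset IN.BIG, and the variable names are assumptions for illustration only.

```sas
libname in  'c:/data/projectdata/fulldata';   * hypothetical path;
libname out 'c:/data/projectdata/output';     * hypothetical path;

* One read of IN.BIG and one write of OUT.MASTER for all three new variables;
data out.master;
   set in.big (keep=idno age sex cost);   * keep only what is needed;
   agesq   = age*age;
   female  = (sex = 2);                   * 1 if sex equals 2, else 0;
   logcost = log(cost + 1);
run;
```

Writing three separate DATA steps for the three variables would read and write the full file three times.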

Clean up your main hard drive if at all possible.

Otherwise you risk SAS crashing when your hard drive gets full. If it does, cancel the job and be sure to delete the temporary SAS datasets that may have been created before you crashed. The SAS default for storing temporary files is something like

C:\Users\”your_user_name”.AD\AppData\Local\Temp\SAS Temporary Files

Unless you have SAS currently open, you can safely delete all of the files stored in that directory. Ideally, there should be none since SAS deletes them when it closes normally. It is the abnormal endings of SAS that cause temporary files to be saved. Delete them, since they can be large!

Change the default hard drive for temporary files and sorting

If you have a large internal secondary hard drive with lots of space, then change the SAS settings so that it uses temp space on that drive for all work files and sorting operations.

To change this default location to a different internal hard drive, find your sasv9.cfg file which is in a location like

“C:\Program Files\SASHome\x86\SASFoundation\9.3\nls\en”

“C:\Program Files\SASHome2-94\SASFoundation\9.4\nls\en”

Find the line in the config file that starts -WORK and change it to your own location for the temporary files (mine are on drives j and k), such as:

-WORK "k:\data\temp\SAS Temporary Files"

-UTILLOC "j:\data\temp\SAS Temporary Files"

The first one is where SAS stores its temporary work files such as WORK.ONE where you define the ONE such as by DATA ONE;

The second line is where SAS stores its own files such as when sorting a file or when saving residuals.

There is a reason to have the WORK and UTIL files on different drives: SAS is then generally reading in from one drive and writing out to a different one, rather than reading in and writing out on the same drive. Try to avoid the latter. Do some tests on your own computer to see how much time you can save by switching from one drive to another instead of only using one drive.
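After editing the config file and restarting SAS, a quick check along these lines should confirm where the work and utility files are actually going. This is a sketch; the values printed in the log depend on your own configuration.

```sas
* Print the current settings of the two options to the log;
proc options option=work;
run;
proc options option=utilloc;
run;

* Physical path of the WORK library for this session;
%put NOTE: WORK library is stored at %sysfunc(pathname(work));
```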

Use only internal hard drives for routine programming

Very large files may require storage or backup on external hard drives, but these are incredibly slow. External drives are three to ten times slower than an internal hard drive. Try to minimize their use for actual project work. Instead, buy more internal drives if possible. You can purchase additional internal hard drives with 2 TB of space for under $100. You save that much in time the first day!

Always try to write large datasets to a different disk drive than you read them in from.

Do some tests copying large files from c: to c: and from C: to F: You may not notice any difference until the file sizes get truly huge, greater than your memory size.

Consider using binary compression to save space and time if you have a lot of binary variables.

By default, SAS stores datasets in a fixed rectangular layout that leaves lots of empty space when you use integers instead of real numbers. Although I have been a long-time fan of using OPTIONS COMPRESS=YES to save space and run time (but not CPU time), I only recently discovered that

OPTIONS COMPRESS=BINARY;

is even better for integers and binary flags when they outnumber real numbers. For some large datasets with lots of zero-one dummies it has reduced my file size by as much as 97%! Standard numeric variables are stored as 8 bytes, which is 8*8=64 bits. In principle you could store 64 binary flags in the space of one real number. Try saving some files with different compression settings and see if your run times and storage space improve. Note: compression INCREASES file size for real numbers! It seems that compression saves space when binary flags outnumber real numbers or integers.

Try various permutations of the following on your computer with your actual data to see what saves time and space;

data real;    retain x1-x100 1234567.89101112; do i = 1 to 100000; output; end; run; proc means; run;

data dummies; retain d1-d100 1;                do i = 1 to 100000; output; end; run; proc means; run;

*try various datasteps with this, using the same or different drives. Bump up the obs to see how times change;
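One way to run the comparison suggested above is to write the same data three times with different COMPRESS= dataset options and then compare sizes. This is a sketch under assumptions: the dataset names are made up, and the FILESIZE and COMPRESS columns of DICTIONARY.TABLES are as I understand them in SAS 9; the notes in the log also report the percent change from compression.

```sas
* Write the same 100 binary dummies with three compression settings;
data dummies_none   (compress=no)
     dummies_char   (compress=yes)
     dummies_binary (compress=binary);
   retain d1-d100 1;
   do i = 1 to 100000;
      output;             * writes one record to all three datasets;
   end;
run;

* Compare the resulting file sizes (in bytes);
proc sql;
   select memname, compress, filesize
      from dictionary.tables
      where libname = 'WORK' and memname like 'DUMMIES%';
quit;
```

Repeating this with the REAL dataset above shows whether compression helps or hurts for real numbers on your own machine.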

 

Create a macro file where you store macros that you want to have available anytime you need them. Do the same with your formats;

options nosource;
%include "c://data/projectname/macrofiles";
%include "c://data/projectname/allformats";
options source;

Be aware of which SAS procs create large, intermediate files

Some but not all procs create huge temporary datasets.

Consider: PROC REG and PROC GLM generate all of the results in one pass through the data unless you have an OUTPUT statement. Then they create large, uncompressed, temporary files that can be a multiple of your original file sizes. PROC SURVEYREG and MIXED create large intermediate files even without an OUTPUT statement. Plan accordingly.

Consider using OUTEST=BETA to more efficiently create residuals together with PROC SCORE.

Compare two ways of making residuals;

*make test dataset with ten million obs, but trivial model;

data test;
do i = 1 to 10000000;
retain junk1-junk100 12345;  * it is carrying along all these extra variables that slows SAS down;
x = rannor(234567);
y = x+rannor(12345);
output;
end;

run;    *30.2 seconds;
*Straightforward way. Times on my computer are shown following each step;
proc reg data = test;
y: model y = x;
output out=resid (keep=resid) residual=resid;
run;  *25 seconds;
proc means data = resid;
run;  *.3 seconds;

*total of the above two steps is 25.6 seconds;

proc reg data = test outest=beta ;
resid: model y = x;
run;                     *3.9 seconds;
proc print data = beta;
run;  *take a look at beta that is created;
proc score data=test score=beta type=parms
out=resid (keep=resid) residual;
var x;
run;       *6 seconds!;
proc means data = resid;
run;  *.3 seconds;

*total from the second method is 10.3 seconds versus 25.6 on the direct approach PLUS no temporary files needed to be created that may crash the system;

If the model statement in both regressions is

y: model y = x junk1-junk100; *note that all of the junk has coefficients of zero, but SAS does not know this going in;

then the two times are

Direct approach:    1:25.84
Scoring approach:  1:12.46 on regression plus 9.01 seconds on score = 1:21.47 which is a smaller savings

On very large files the time savings are even greater because of the reduced IO; SAS is still able to do this without writing onto the hard drive in this “small” sample on my computer. But the real savings is on temporary storage space.

Use a bell!

My latest addition to my macro list is the following bell macro, which makes sounds.

Use %bell; at the end of your SAS program that you run batch and you may notice when the program has finished running.

%macro bell;
*plays the trumpet call, useful to put at end of batch program to know when the batch file has ended;
*Randy Ellis and Wenjia Zhu November 18 2014;
data _null_;
call sound(392.00,70); *first argument is frequency, second is duration;
call sound(523.25,70);
call sound(659.25,70);
call sound(783.99,140);
call sound(659.25,70);
call sound(783.99,350);
run;
%mend;
%bell;

Purchase essential SAS programming guides.

I gave up on purchasing the paper copy of SAS manuals, because they take up more than two feet of shelf space, and are still not complete or up to date. I find the SAS help menus useful but clunky. I recommend the following if you are going to do serious SAS programming. Buy them used on Amazon or whatever. I would get an older edition, and it will cost less than $10 each. Really.

The Little SAS Book: A Primer, Fifth Edition (or an earlier one)

Nov 7, 2012

by Lora Delwiche and Susan Slaughter

Beginners introduction to SAS. Probably the best single book to buy when learning SAS.

 

Professional SAS Programmer’s Pocket Reference Paperback

By Rick Aster

http://www.amazon.com/Professional-SAS-Programmers-Pocket-Reference/dp/189195718X

Wonderful, concise summary of all of the main SAS commands, although you will have to already know SAS to find it useful. I use it to look up specific functions, macro commands, and options on various procs because it is faster than using the help menus. But I am old style…

Professional SAS Programming Shortcuts: Over 1,000 ways to improve your SAS programs Paperback

By Rick Aster

http://www.amazon.com/Professional-SAS-Programming-Shortcuts-programs/dp/1891957198/ref=sr_1_1?s=books&ie=UTF8&qid=1417616508&sr=1-1&keywords=professional+sas+programming+shortcuts

I don’t use this as much as the above, but if I had time, and were learning SAS instead of trying to rediscover things I already know, I would read through this carefully.

Get in the habit of deleting most intermediate permanent files

Delete files if either

1. You won’t need them again or

2. You can easily recreate them again.  *this latter point is usually true;

Beginner programmers tend to save too many intermediate files. Usually it is easier to rerun the entire program instead of saving the intermediate files. Give your final file of interest a name like MASTER or FULL_DATA then keep modifying it by adding variables instead of names like SORTED, STANDARDIZED,RESIDUAL,FITTED.

Consider a macro that helps make it easy to delete files.

%macro delete(library=work, data=temp, nolist=);

proc datasets library=&library &nolist;
delete &data;
run;
quit;
%mend;

*sample macro calls;

%delete (data=temp);   *for temporary work files (you can also list multiple file names), but these disappear anyway at the end of your run;

%delete (library=out, data=one two three);   *for three permanent files in library out;

%delete (library=out, data=one, nolist=nolist);   *gets rid of list in output;

 

 

Ellis SAS tips for New SAS programmers

There is also a posting on Ellis SAS tips for Experienced SAS programmers

It focuses on issues when using large datasets.

 

Randy’s SAS hints for New SAS programmers, updated Feb 21, 2015

  1. ALWAYS

    begin and intermix your programs with internal documentation. (Note how I combined six forms of emphasis in ALWAYS: color, larger font, caps, bold, italics, underline.) Normally I recommend only one, but documenting your programs is really important. (Using only one form of emphasis is also important, just not really important.)

A simple example to start your program in SAS is

******************
* Program = test1, Randy Ellis, first version: March 8, 2013 – test program on sas features
***************;

Any comment starting with an asterisk and ending in a semicolon is ignored;

 

    2. Most common errors/causes of wasted time while programming in SAS.

a. Forgetting semicolons at the end of a line

b. Omitting a RUN statement, and then waiting for the program to run.

c. Unbalanced single or double quotes.

d. Unintentionally commenting out more code than you intend to.

e. Foolishly running a long program on a large dataset that has not first been tested on a tiny one.

f. Trying to print out a large dataset which will overflow memory or hard drive space.

g. Creating an infinite loop in a datastep; Here is one silly one. Usually they can be much harder to identify.

data infinite_loop;
x=1;
nevertrue=0;
do while (x=1);
if nevertrue =1 then x=0;
end;
run;

h. There are many other common errors and causes of wasted time. I am sure you will find your own

 

  3. With big datasets, 99% of the time it pays to use the following system OPTIONS:

 

options compress =yes nocenter;

or

options compress =binary nocenter;

Binary compression works particularly well with many binary dummy variables, and is sometimes spectacular, saving 95%+ on storage space and hence run time.

 

/* mostly use */
options nocenter /* SAS sometimes spends many seconds figuring out how to center large print outs of
data or results. */
ps=9999               /* avoid unneeded headers and page breaks that split up long tables in output */
ls=200;                /* some procs like PROC MEANS give less output if a narrow line size is used */
 

*other key options to consider;

Options obs = max   /* or obs=100, Max= no limit on maximum number of obs processed */
Nodate nonumber /* useful if you don’t want SAS to embed headers at top of each page in listing */
Macrogen     /* show the SAS code generated after running the Macros. */
Mprint   /* show how macro code and macro variables resolve */
nosource /* suppress source code from long log */
nonotes   /* be careful, but can be used to suppress notes from log for long macro loops */

;                       *remember to always end with a semicolon!;

 

  4. Use these three key procedures regularly

Proc contents data=test; run; /* shows a summary of the file similar to Stata’s DESCRIBE */
Proc means data = test (obs=100000); run; /* set a max obs if you don’t want this to take too long */
Proc print data = test (obs=10); run;

 

I recommend you create and use regularly a macro that does all three easily:

%macro cmp(data=test);
Proc contents data=&data; run; Proc means data = &data (obs=1000); run; Proc print data = &data (obs=10); run;
%mend;

Then do all three (contents, means, print ten obs) with just

%cmp(data = mydata);

 

  5. Understand temporary versus permanent files;

Data one;   creates a work.one temporary dataset that disappears when SAS terminates;

Data out.one; creates a permanent dataset in the out directory that remains even if SAS terminates;

 

Define libraries (or directories):

Libname out "c:/data/marketscan/output";
Libname in "c:/data/marketscan/MSdata";
 

 

Output or data can be written into external files:

Filename textdata "c:/data/marketscan/textdata.txt";

 

  6. Run tests on small samples to develop programs, and then toggle between tiny and large samples once debugged.

A simple way is

Options obs =10;
*options obs = max; *only use this when you are sure your programs run;

However, some procedures and data steps using the End= dataset option do not work well on partial samples. For those I often toggle between two different input libraries. Create a subset image of all of your data in a separate directory and then toggle using the libname commands;

 

*Libname in 'c:/data/projectdata/fulldata';
Libname in 'c:/data/projectdata/testsample';

 

Time spent creating a test data set is time well spent.

You could even write a macro to make it easy. (I leave it as an exercise!)
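One possible answer to the exercise, as a rough sketch rather than the only way to do it: a macro that copies the first N observations of a full dataset into the test directory. The librefs FULL and TEST, the macro name MAKETEST, and the dataset name CLAIMS are all hypothetical.

```sas
libname full 'c:/data/projectdata/fulldata';
libname test 'c:/data/projectdata/testsample';

* Copy the first &n observations of FULL.&data into TEST.&data;
%macro maketest(data=, n=1000);
   data test.&data;
      set full.&data (obs=&n);
   run;
%mend maketest;

%maketest(data=claims, n=1000);   * CLAIMS is a made-up dataset name;
```

Taking the first N records is the simplest choice; if files must later be merged by ID, you may prefer to subset on a list of IDs instead so that the test files still match up.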

 

  7. Use arrays abundantly. You can use different array names to reference the same set of variables. This is very convenient;

 

%let rhs=x1 x2 y1 y2 count more;
Data _null_;
Array X {100} X001-X100; *usual form;
Array y {100} ;                     * creates y1-y100;
Array xmat {10,10} X001-X100; *matrix notation allows two dimensional indexes;
Array XandY {*} X001-X100 y1-y100 index ; *useful when you don’t know the count of variables in advance;
Array allvar &rhs. ;     *implicit arrays can use implicit indexes;
 

*see various ways of initializing the array elements to zero;

Do i = 1 to 100; x{i} = 0; end;
 

Do i = 1 to dim(XandY); XandY{i} = 0; end;

 

Do over allvar; allvar = 0; end;   *sometimes this is very convenient;

 

Do i=1 to 100 while (y(i) = . );
y{i} = 0;   *do while and do until are sometimes useful;
end;

 

run;

  8. For some purposes naming variables in arrays using leading zeros improves sort order of variables

Use:
Array x {100} X001-X100;
not
Array x {100} X1-X100;

With the second, the alphabetically sorted variables are x1,x10,x100, x11, x12,..,x19, x2,x20 , etc.

 

  9. Learn Set versus Merge command (Update is for rare, specialized use)

 

Data three;   *information on the same person combined into a single record;
Merge ONE TWO;
BY IDNO;
Run;
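To make the Set versus Merge contrast concrete, here is a small self-contained sketch. The two tiny input datasets and their variables AGE and COST are invented for illustration.

```sas
data one;
   input idno age;
   datalines;
1 34
2 51
;
run;

data two;
   input idno cost;
   datalines;
1 100
2 250
;
run;

* SET stacks the files: 4 records, all of ONE followed by all of TWO;
data stacked;
   set one two;
run;

* MERGE matches on IDNO: 2 records, each with both AGE and COST;
data three;
   merge one two;
   by idno;
run;

proc print data=stacked; run;
proc print data=three;   run;
```

In STACKED, COST is missing on the records from ONE and AGE is missing on the records from TWO. Both inputs are already in IDNO order here, so no PROC SORT is needed before the merge.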

 

  10. Learn key dataset options like

Obs=
Keep=
Drop=
In=
Firstobs=
Rename=(oldname=newname)
End=

 

  11. Keep files being sorted “skinny” by using drop or keep statements

Proc sort data = IN.BIG(keep=IDNO STATE COUNTY FROMDATE) out=out.bigsorted;
BY STATE COUNTY IDNO FROMDATE;
Run;

Also consider NODUP and NODUPKEY options to sort while dropping duplicate records, on all or on BY variables, respectively.

 

  12. Take advantage of BY group processing

Use FIRST.var and LAST.var abundantly.

 

USE special variables
_N_ = current observation counter
_ALL_ set of all variables such as Put _all_. Or when used with PROC CONTENTS, set of all datasets.

 

Also valuable is

PROC CONTENTS data = in._all_; run;
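A minimal sketch of BY group processing with FIRST.var and LAST.var, collapsing several records per person into one summary record. The dataset VISITS and its variables are invented for illustration.

```sas
data visits;
   input idno cost;
   datalines;
1 100
1 50
2 250
3 10
3 20
3 30
;
run;

proc sort data=visits;
   by idno;
run;

* One output record per IDNO with a visit count and total cost;
data person (keep=idno nvisits totcost);
   set visits;
   by idno;
   if first.idno then do;      * reset the counters at the start of each person;
      nvisits = 0;
      totcost = 0;
   end;
   nvisits + 1;                * sum statements retain values across records;
   totcost + cost;
   if last.idno then output;   * write only once the person is complete;
run;

proc print data=person; run;
```

With this input, person 1 should have 2 visits totaling 150, person 2 has 1 visit totaling 250, and person 3 has 3 visits totaling 60.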

 

  13. Use lots of comments

 

* this is a standard SAS comment that ends with a semicolon;

 

/*   a PL1 style comment can comment out multiple lines including ordinary SAS comments;

* Like this; */

 

%macro junk; Macros can even comment out other macros or other pl1 style comments;

/*such as this; */ * O Boy!; %macro ignoreme;   %mend; *very powerful;

 

%mend; * end macro junk;

 

  14. Use meaningful file names!

Data ONE TWO THREE can be useful.

 

  15. Put internal documentation about what the program does, who did it and when.
  16. Learn basic macro language; See SAS program demo for examples. Know the difference between executable and declarative statements used in DATA step
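As one small illustration of basic macro language, the sketch below loops over years and generates one PROC MEANS per year. The macro name YEARLY and the datasets IN.CLAIMS2010 and IN.CLAIMS2011 are hypothetical; the libref IN is assumed to be defined already.

```sas
options mprint;   * show the SAS code that the macro generates in the log;

%macro yearly(first=2010, last=2011);
   %do year = &first %to &last;
      title "Claims for &year";
      proc means data=in.claims&year;   * resolves to in.claims2010, in.claims2011;
      run;
   %end;
%mend yearly;

%yearly(first=2010, last=2011);
```

The %DO loop runs while the program is being generated, not once per record, which is the key difference from an ordinary DO loop inside a DATA step.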

 

17. EXECUTABLE COMMANDS USED IN DATA STEP (Actually DO something, once for every record)

 

Y=y+x (assignment. In STATA you would use GEN y=x or REPLACE Y=X)
 
Do I = 1 to 10;
End; (always paired with DO, can be nested nearly unlimited deepness)

 

INFILE 'c:/data/MSDATA/claimsdata.txt';              * define where INPUT statements read from;
FILE 'c:/data/MSDATA/mergeddata.txt';                * define where PUT statements write to;

 

Goto johnny;      * always avoid. Use do groups instead;

 

IF a=b THEN y=0 ;
ELSE y=x; * be careful when multiple if statements;
CALL subroutine(); (Subroutines are OK, Macros are better)

 

INPUT   X ; (read in one line of X as text data from INFILE)
PUT   x y= / z date.; (Write out results to current LOG or FILE file)

 

MERGE IN.A IN.B ;
BY IDNO;         *   Match up with BY variable IDNO as you simultaneously read in A&B;

Both files must already be sorted by IDNO.

SET A B;                                           * read in order, first all of A, and then all of B;

UPDATE   A B; *replace variables with new values from B only if non missing in B;

 

OUTPUT out.A;      *Write out one obs to out.A SAS dataset;
OUTPUT;                *Writes out one obs of every output file being created;

DELETE;   * do not output this record, and return to the top of the datastep;

STOP;                               * ends the current SAS datastep;

 

18. Declarative commands for DATA step are processed only once, at the start of the data step

 

DATA ONE TWO IN.THREE;

*This would create three data sets, named ONE TWO and IN.THREE

Only the third one will be kept once SAS terminates.;

Array x {10} x01-x10;
ATTRIB x length=8 Abc length=$8;
RETAIN COUNT 0;
BY state county IDNO;
Also consider  
BY DESCENDING IDNO; or BY IDNO UNSORTED; if grouped but not sorted by IDNO;
DROP i;   * do not keep i in final data set, although it can still be used while the data step is running;
KEEP IDNO AGE SEX; *this will drop all variables from output file except these three;
FORMAT x date.;   *permanently link the format DATE. to the variable x;

INFORMAT ABC $4.;

LABEL AGE2010 = "Age on December 31 2010";
LENGTH x 8; *must be assigned the first time you reference the variable;
RENAME AGE = AGE2010; *after this data step you must use the new name (AGE2010), but inside the step keep using AGE;
OPTIONS OBS=100; *one of many options. Note done only once;

 

19. Key Systems language commands

LIBNAME to define libraries
FILENAME to define specific files, such as for text data to input or output text

TITLE THIS TITLE WILL APPEAR ON ALL OUTPUT IN LISTING until a new title line is given;

%INCLUDE

%LET year=2011;

%LET ABC = "Randy Ellis";

 

20. Major procs you will want to master

DATA step !!!!! Counts as a procedure;

PROC CONTENTS

PROC PRINT

PROC MEANS

PROC SORT

PROC FREQ                      frequencies

PROC SUMMARY      (Can be done using MEANS, but easier)

PROC CORR (Can be done using Means or Summary)

PROC REG       OLS or GLS

PROC GLM   General Linear Models with automatically created fixed effects

PROC FORMAT /INFORMAT

PROC UNIVARIATE

PROC GENMOD nonlinear models

PROC SURVEYREG clustered errors

None of the above will execute unless a new PROC is started OR you include a RUN; statement.

21. Formats are very powerful. Here is an example from the MarketScan data. One use is to simply recode variables so that richer labels are possible.

 

Another use is to look up or merge on other information in large files.

 

Proc format;
value $region
1='1-Northeast Region           '
2='2-North Central Region       '
3='3-South Region               '
4='4-West Region               '
5='5-Unknown Region             '
;

value $sex

1='1-Male           '
2='2-Female         '
other=' Missing/Unknown'

;
run;

 

*Three different uses of formats;

Data one ;
sex='1';
region=1;
Label sex = 'patient sex =1 if male';
label region = 'census region';
run;

Proc print data = one;

Run;

 

data two;
set one;
Format sex $sex.; * permanently assigns sex format to this variable and stores format with the dataset;
Run;

Proc print data = two;
Run;

Proc contents data = two;
Run;

*be careful if the format is very long!;

 

Data three;
Set one;
Charsex=put(sex,$sex.);
Run;

*maps sex into the label, and saves a new variable as the text strings. Be careful can be very long;

Proc print data =three;
Run;

 

Proc print data = one;
Format sex $sex.;
*this is almost always the best way to use formats: Only on your results of procs, not saved as part of the datasets;
Run;
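The other use mentioned above, looking up information with a format instead of a sort and merge, can be sketched as follows. Everything here is illustrative: the format name $CTY2ST, the datasets LOOKUP and BIG, and the county codes are assumptions, not part of the MarketScan example.

```sas
* Lookup table in the layout PROC FORMAT expects: FMTNAME, TYPE, START, LABEL;
data lookup;
   retain fmtname 'cty2st' type 'C';   * character format named $CTY2ST;
   length start $5 label $2;
   input start $ label $;
   datalines;
25017 MA
25025 MA
36061 NY
;
run;

proc format cntlin=lookup;   * build the format from the dataset;
run;

data big;                    * stand-in for a large file keyed by county;
   input idno county $;
   datalines;
1 25017
2 36061
;
run;

data big2;
   set big;
   state = put(county, $cty2st.);   * look up state without sorting or merging;
run;

proc print data=big2; run;
```

The appeal on large files is that BIG is read once in its existing order; only the small lookup table has to be turned into a format.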

 

If you are trying to learn SAS on your own, then I recommend you buy:

The Little SAS Book: A Primer, Fifth Edition (or an earlier one)

Nov 7, 2012

by Lora Delwiche and Susan Slaughter

Beginners introduction to SAS. Probably the best single book to buy when learning SAS.

Deflategate pressure drop is consistent with a ball air temperature of 72 degrees when tested initially.

I revised my original Deflategate posting after learning that it is absolute air pressure, not pressure above standard sea level pressure, that follows the Ideal Gas Law. I also allowed for stretching of the leather once the ball becomes wet, and for the possibility that the cold rain was colder (45 degrees F) than the recorded air temperature of 53 degrees F. Together these adjustments make it even easier for the weather to fully explain the drop in ball pressure.

My Bottom Line: The NFL owes the Patriot Nation and Bob Kraft a big apology.

Correction #1: My initial use of the ideal gas formula did not recognize that it is absolute pressure, not pressure above the ambient air pressure, that matters. Hence a ball with a pressure of 12.5 PSI is actually 12.5 PSI above the surrounding air pressure, which is about 14 PSI at sea level. So a decline from 12.5 PSI to 10.5 PSI is actually only a change in absolute pressure from 26.5 to 24.5 PSI, a ratio of 26.5/24.5 = 1.082, or 8.2 percent. This makes it much easier for temperature changes to explain the difference in ball pressure. Only an 8.2 percent change in absolute temperature (approximately a 42 degree Fahrenheit drop) would be required if that were the only change needed.
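The arithmetic in Correction #1 can be checked with a few lines of SAS. This is only a sketch: it assumes 14 PSI for sea level air pressure, as in the text, and uses the recorded 53 degree game time air temperature to convert the percentage into degrees.

```sas
data _null_;
   atm    = 14;                            * assumed sea level air pressure, PSI;
   ratio  = (12.5 + atm)/(10.5 + atm);     * ratio of absolute pressures;
   pct    = 100*(ratio - 1);               * percent change in absolute pressure;
   t_game = 53 + 460;                      * game time air temperature, degrees above absolute zero;
   drop_f = (ratio - 1)*t_game;            * implied temperature drop in degrees F;
   put ratio= 6.4 pct= 5.1 drop_f= 5.1;
run;
```

The percent change should come out close to the 8.2 percent, and the implied drop close to the 42 degrees, quoted above.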

Correction #2: It is well established that water allows leather to stretch. I found one site that noted that water can allow leather to stretch by 2-5% when wet.  It does not specify how much force is needed to achieve this.

https://answers.yahoo.com/question/index;_ylt=A0LEVvwgfs9UP0AAr40nnIlQ?qid=20060908234923AAxt7xP

It is plausible that a new ball made of leather under pressure (scuffed up to let in the moisture quickly) might stretch 1 percent upon getting wet (such as in the rain). Since volume goes up with the cube of this stretching, this would be a (1.01)^3 -1= 3 percent increase in ball volume or decline in pressure. This amount would reduce the absolute temperature difference needed for the 2 PSI drop to only 5.2 percent (a change of only 27 degrees F).

Correction #3: It was raining on game day, and the rain was probably much colder than the outside air temperature. So it is plausible that the game ball was as cold as 45 degrees Fahrenheit at game time when the low ball pressures were detected. This makes even lower initial testing temperatures consistent with the professed levels of underinflation.

A single formula can be used to calculate the ball temperature needed when tested initially to explain a ball pressure detected during the game that is 2 PSI lower, after getting colder (to 45 degrees F), .004 smaller (since ball volume shrinks when cold), and stretched 1% due to rain. It would be

Pregame testing temperature in F =(pressure change as a ratio)/(volume change due to cold)/(volume change due to leather stretching 1% when wet)*(45 degree ball temperature during game+460 degrees) – 460 degrees

(12.5+14)/(10.5+14)/(.996)/(1.01^3)*(45+460) – 460 = 72 degrees Fahrenheit

Given this math, it would have been surprising if the ball pressure had NOT declined significantly.
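The single formula above is easy to reproduce in a short DATA step. This sketch uses the same inputs as the text (14 PSI sea level pressure, a 0.996 volume ratio from cooling, 1 percent stretching, a 45 degree ball during the game) and loops over the three candidate game time readings of 10.5, 11 and 11.5 PSI; the variable names are my own.

```sas
data _null_;
   atm     = 14;        * sea level air pressure in PSI, as assumed in the text;
   p_test  = 12.5;      * minimum allowed gauge pressure at pregame testing;
   shrink  = 0.996;     * ball volume ratio when cold;
   stretch = 1.01**3;   * volume ratio from 1 percent stretching when wet;
   t_game  = 45;        * assumed ball temperature during the game, degrees F;
   do p_game = 10.5, 11, 11.5;
      * pregame testing temperature in F implied by each game time reading;
      t_test = (p_test + atm)/(p_game + atm)/shrink/stretch*(t_game + 460) - 460;
      put p_game= t_test= 6.1;
   end;
run;
```

The three values written to the log should come out near 72, 62 and 51 degrees, the pregame temperatures implied by a 2, 1.5 and 1 PSI difference.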

Final comment #1: All of these calculations and hypotheses can be tested empirically. See the empirical analysis done by Headsmart Labs (http://www.headsmartlabs.com). They find that rain plus a 25 degree drop is consistent with a 1.82 PSI decrease.

Final comment #2: Since the original game balls were reinflated by officials during halftime, the true ball pressures during the first half will never be known. Moreover there seems to be no documentary record of their pressures at the time they were re-inflated.

The XLIX Superbowl was a terrific game from the point of view of Patriots fans. Now it is time for the NFL  to own up to its own mistake in accusing the Patriots of cheating.  It was just a matter of physics.

Revised calculations

 

Various combinations of testing temperatures and PSI
A B C D E F G H I J K L M N O
Adjustments for temperature only, correcting for absolute pressure at 14 PSI at sea level Adjustments for changes in ball volume Adjusting for temperature and football volume
Temperature F Degrees above Absolute zero Temperature adjustment Various game time or testing PSI readings surface area sphere radius mean football radius volume Volume adjustment Various game time or testing PSI readings
Game time temperature 45 505 1.000 10.5 11 11.5 189 3.8782 3.81183 232 1.000 10.5 11 11.5
60 520 1.030 11.2 11.7 12.3 189.2427 3.8807 3.81427 232.447 0.998 11.3 11.8 12.3
70 530 1.050 11.7 12.2 12.8 189.4045 3.8824 3.81590 232.7451 0.997 11.8 12.3 12.8
Possibl e testing temperatures 80 540 1.069 12.2 12.7 13.3 189.5663 3.8840 3.81753 233.0434 0.996 12.3 12.9 13.4
90 550 1.089 12.7 13.2 13.8 189.7280 3.8857 3.81916 233.3418 0.994 12.8 13.4 13.9
100 560 1.109 13.2 13.7 14.3 189.8898 3.8873 3.82079 233.6403 0.993 13.4 13.9 14.5
110 570 1.129 13.7 14.2 14.8 190.0516 3.8890 3.82242 233.939 0.992 13.9 14.5 15.0
120 580 1.149 14.1 14.7 15.3 190.2134 3.8906 3.82404 234.2378 0.990 14.4 15.0 15.6
130 590 1.168 14.6 15.2 15.8 190.3752 3.8923 3.82567 234.5367 0.989 14.9 15.5 16.1
140 600 1.188 15.1 15.7 16.3 190.5370 3.8940 3.82730 234.8357 0.988 15.5 16.1 16.7
150 610 1.208 15.6 16.2 16.8 190.6988 3.8956 3.82892 235.1349 0.987 16.0 16.6 17.2
160 620 1.228 16.1 16.7 17.3 190.8606 3.8973 3.83054 235.4342 0.985 16.5 17.1 17.8
Temperature (Fo) at which ball would pass test. 2 PSI diff 1.5 PSI diff 1 PSI diff 88 77 67
Temperature only 86 75 65
Temperature and volume change from temp 88 77 67
temp, volume, and stretching from wetness 72 62 51
Last row calculated as (12.5+14)/(inferred test level+14)/(0.996)/(1.01^3)*(45+460)-460
Notes
Revised calculations allow for sea-level atmospheric pressure of 14 PSI, so a change from 10.5 to 12.5 PSI (above this level) requires only a (12.5+14)/(10.5+14)-1=8.2 percent change in absolute temperature.
See notes at the top, but final calculations also allow for the possibilities that ball temperature was 45 degrees, not 53, due to cold rain, and 1% stretching in leather due to rain.
Fields in first row and first column are input parameters, others are calculated
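
The three "pass test" rows can be reproduced with a short script. This is a minimal sketch, assuming the last-row formula quoted above with its rounded constants (14 PSI atmospheric pressure, a 460-degree offset, a 0.996 volume factor, and 1% stretching); the function and variable names are illustrative and not taken from the spreadsheet.

```python
# Sketch of the revised pass-temperature calculation from the notes above.
# All names are illustrative; the constants are the rounded values in the notes.

ATM_PSI = 14.0           # sea-level atmospheric pressure assumed in the notes
MIN_LEGAL_PSI = 12.5     # lowest allowed gauge pressure
GAME_TEMP_F = 45.0       # assumed ball temperature at game time (cold rain)
ABS_ZERO_OFFSET = 460.0  # rounded offset to degrees above absolute zero
VOLUME_ADJ = 0.996       # volume factor used in the last-row formula
STRETCH = 1.01           # 1% linear stretching of wet leather


def pass_temperature(game_psi, volume_adj=1.0, stretch=1.0):
    """Testing temperature (F) at which a ball that reads game_psi at game
    time would have read MIN_LEGAL_PSI."""
    ratio = (MIN_LEGAL_PSI + ATM_PSI) / (game_psi + ATM_PSI)
    ratio = ratio / volume_adj / stretch ** 3
    return ratio * (GAME_TEMP_F + ABS_ZERO_OFFSET) - ABS_ZERO_OFFSET


for psi in (10.5, 11.0, 11.5):
    temp_only = round(pass_temperature(psi))
    temp_volume = round(pass_temperature(psi, VOLUME_ADJ))
    temp_volume_wet = round(pass_temperature(psi, VOLUME_ADJ, STRETCH))
    print(psi, temp_only, temp_volume, temp_volume_wet)
```

For game time readings of 10.5, 11 and 11.5 PSI this gives 86/75/65, 88/77/67 and 72/62/51 degrees, matching the three rows of the table.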

 

Original post

There is no mention of the temperature at which the footballs need to be stored or tested in the official NFL rule book. (Sloppy rules!)

The process of scuffing up the new balls to make them feel better no doubt warms them up. It would be impossible for it to be otherwise. The empirical questions are how much it warmed them up and what temperature they were when tested.

Surface temperatures could have been below the temperature of the air inside the ball, which is what matters for the pressure. Leather is a pretty good insulator (hence its use in many coats).

Anyone who took high school physics may remember that pressure and temperature satisfy

PV=nRT

Pressure*Volume=Number of moles*ideal gas constant*Temperature  (Ideal Gas Law)

Temperature needs to be measured in degrees above absolute zero, which is -459.67 Fahrenheit (sorry metric readers!). The temperature at game time was 53 degrees. So the right question to ask is: At what temperature, T1, would the air in the ball have to be at the time the balls were tested such that once they cooled down to T0=53 degrees they measured two pounds per square inch (PSI) below the allowed minimum?

The lowest allowed pressure for testing was 12.5 PSI. We are told only vaguely that the balls were 2 PSI lower than this, but this is not a precise number. It could be that it was rounded from 1.501 PSI, which would mean they might have been 11 PSI when tested during the game. I examine 10.5, 11 and 11.5 as possible game time test PSI levels.

The following table shows possible combinations of game time temperature and testing temperatures that would be consistent with various pressures. The right hand side of the table makes an adjustment for the fact that the leather/rubber in the ball would also have shrunk as the ball cooled down, which works against the temperature.

Using the formula PSI1=PSI0*((T1+459.67)/(T0+459.67)) (see correction above!) and ignoring the volume change of the ball, it is straightforward to solve for what initial temperature the balls would have had to be for the observed game time pressures.

Adjusting for a plausible guess at the small amount that the leather plus rubber bladder would have also changed makes only a small difference.

For a 1.5 PSI difference from testing to halftime, the air inside the balls would have had to be at about 128 degrees at the time they were tested. (The leather skin could have been at a lower temperature.) This would have made them feel warm but not burning hot to the hand.

Allowing the balls to be warm when tested is sneaky or perhaps accidental, but not cheating.

Go Pats!

Various combinations of testing temperatures and PSI
A B C D E F G H I J K L M N O
Column groups: B-G, adjustments for temperature only; H-L, adjustments for changes in ball volume; M-O, adjusting for temperature and football volume
Column headers: A, row label; B, temperature (°F); C, degrees above absolute zero; D, temperature adjustment; E-G, various game time or testing PSI readings; H, surface area; I, sphere radius; J, mean football radius; K, volume; L, volume adjustment; M-O, various game time or testing PSI readings
Game time temperature 53 512.67 1.000 10.5 11 11.5 189 3.8782 3.81183 232 1.000 10.5 11 11.5
Possible testing temperatures 80 539.67 1.053 11.1 11.6 12.1 189.4368 3.8827 3.81623 232.8048 1.003 11.0 11.5 12.1
90 549.67 1.072 11.3 11.8 12.3 189.5986 3.8844 3.81786 233.1031 1.005 11.2 11.7 12.3
100 559.67 1.092 11.5 12.0 12.6 189.7604 3.8860 3.81949 233.4015 1.006 11.4 11.9 12.5
110 569.67 1.111 11.7 12.2 12.8 189.9222 3.8877 3.82112 233.7001 1.007 11.6 12.1 12.7
120 579.67 1.131 11.9 12.4 13.0 190.0840 3.8893 3.82274 233.9988 1.009 11.8 12.3 12.9
130 589.67 1.150 12.1 12.7 13.2 190.2458 3.8910 3.82437 234.2976 1.010 12.0 12.5 13.1
140 599.67 1.170 12.3 12.9 13.5 190.4076 3.8926 3.82600 234.5965 1.011 12.1 12.7 13.3
150 609.67 1.189 12.5 13.1 13.7 190.5693 3.8943 3.82762 234.8956 1.012 12.3 12.9 13.5
160 619.67 1.209 12.7 13.3 13.9 190.7311 3.8959 3.82924 235.1948 1.014 12.5 13.1 13.7
Temperature (°F) at which ball would pass test. 151 123 98 159 128 101
Notes
Fields in yellow are input parameters, others are calculated
Column C is temperature minus absolute zero
Column D is the ratio of column C to the game time temp in absolute degrees and shows how much higher PSI would have been than at game time.
Columns E through G show possible testing PSI for three possible game time PSI levels.
Columns H through L show adjustments to volume which tend to reduce the PSI as a ball is heated. Calculations use rate of expansion of hard rubber per square inch per degree.
Columns M through O show ball PSI after adjusting for both air temperature and football volume
Parameters and formulas
absolute zero = -459.67 Fahrenheit
hard rubber expansion 42.8 (10^-6 in/(in °F)) http://www.engineeringtoolbox.com/linear-expansion-coefficients-d_95.html
or 0.0000428 Used for column I expansion of surface area
Surface area assumed to grow with the square of this proportion with temperature.
The approximate volume and surface area of a standard football are 232 cubic inches and 189 square inches, respectively.
http://www.answers.com/Q/Volume_and_surface_area_of_a_football
Surface of a sphere formula
4*pi*radius^2 Used to calculate radius of sphere
Volume of sphere formula
4/3*pi*radius^3 Used to calculate volume of football. Volume adjusted downward by a fixed proportion because footballs are not spheres.
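
The volume columns (H through O) can be sketched in Python from these parameters. This is a rough reconstruction under stated assumptions, not the spreadsheet itself: the tabulated surface areas appear to match a linearized expansion, 189*(1+2*0.0000428*(T-53)), and the "fixed proportion" is taken to be whatever scales the sphere radius so that the volume is 232 cubic inches at game time. All function names are illustrative.

```python
import math

ALPHA = 42.8e-6      # linear expansion of hard rubber per degree F (from the notes)
BASE_AREA = 189.0    # square inches at game time temperature
BASE_VOLUME = 232.0  # cubic inches at game time temperature
GAME_TEMP_F = 53.0
ABS_ZERO_F = -459.67


def sphere_radius(area):
    """Radius of a sphere with the given surface area (area = 4*pi*r^2)."""
    return math.sqrt(area / (4.0 * math.pi))


def sphere_volume(radius):
    """Volume of a sphere, 4/3*pi*r^3."""
    return 4.0 / 3.0 * math.pi * radius ** 3


# Assumed fixed proportion that scales the sphere radius down so the
# volume equals BASE_VOLUME at game time temperature.
SHAPE_FACTOR = (BASE_VOLUME / sphere_volume(sphere_radius(BASE_AREA))) ** (1.0 / 3.0)


def football_volume(temp_f):
    """Approximate ball volume at temp_f (column K)."""
    # Assumption: area grows as 1 + 2*alpha*dT, which matches the table values.
    area = BASE_AREA * (1.0 + 2.0 * ALPHA * (temp_f - GAME_TEMP_F))
    return sphere_volume(SHAPE_FACTOR * sphere_radius(area))


def adjusted_psi(game_psi, temp_f):
    """PSI at temp_f after both temperature and volume adjustments (columns M-O)."""
    temp_ratio = (temp_f - ABS_ZERO_F) / (GAME_TEMP_F - ABS_ZERO_F)
    volume_ratio = football_volume(temp_f) / BASE_VOLUME
    return game_psi * temp_ratio / volume_ratio


print(round(football_volume(80.0), 4), round(adjusted_psi(10.5, 80.0), 1))
```

At 80 degrees this gives a volume of about 232.80 cubic inches and an adjusted reading of 11.0 PSI for a 10.5 PSI game time ball, in line with the 80 degree row of the table.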

 

NFL rules

Rule 2 The Ball
Section 1 BALL DIMENSIONS
The Ball must be a “Wilson,” hand selected, bearing the signature of the Commissioner of the League, Roger Goodell. The ball shall be made up of an inflated (12 1/2 to 13 1/2 pounds) urethane bladder enclosed in a pebble grained, leather case (natural tan color) without corrugations of any kind. It shall have the form of a prolate spheroid and the size and weight shall be: long axis, 11 to 11 1/4 inches; long circumference, 28 to 28 1/2 inches; short circumference, 21 to 21 1/4 inches; weight, 14 to 15 ounces. The Referee shall be the sole judge as to whether all balls offered for play comply with these specifications. A pump is to be furnished by the home club, and the balls shall remain under the supervision of the Referee until they are delivered to the ball attendant just prior to the start of the game.

From the Free Dictionary: ideal gas law, n. A physical law describing the relationship of the measurable properties of an ideal gas, where P (pressure) × V (volume) = n (number of moles) × R (the gas constant) × T (temperature in Kelvin). It is derived from a combination of the gas laws of Boyle, Charles, and Avogadro. Also called universal gas law.

 

Useful Data Links to US Government data

Websites for Federal Administrative Data sets:

US Agency for International Development:
Foreign aid from the U.S: Data and Tools

Department of Agriculture:
Economic Research Service: Supplemental Nutrition Assistance Program (SNAP) Data System
Food and Nutrition Service: Commodity Supplemental Food Program Data
Food Safety and Inspection Service: Recalls and Quarterly Enforcement Reports
Forest Inventory Data
National Agricultural Statistics Service: Cropland Data
Natural Resources Conservation Service: Conservation Financial Assistance Programs’ Enrollment Data
Risk Management Agency (RMA): Program Costs and Outlays Data
RMA: Actuarial Data
Web Based Supply Chain Management Reports Data

US Army:
Army Corps of Engineers: U.S. Waterborne Commerce Data

Department of Commerce:
Bureau of Economic Analysis (BEA): Foreign Direct Investments Data in the US
BEA: US National Income and Product Account (NIPA) Data
Economic Development Administration: Program Data
Census: Business Register Data and Longitudinal Business Database
Census: Longitudinal Employer-Household Dynamics
Census: County and Zip Code Business Patterns
International Trade Administration (ITA): U.S. Exporting Companies Data
ITA: Export-Supported Employment Data
ITA: Visitors Arrivals Program (Form I-94) Data
ITA: International Air Travel Statistics ( Form I-92) Program Data
National Climatic Data Center: National climate and historic weather data
National Marine Fisheries Service: Recreational Fisheries statistics or Commercial Fisheries Statistics

Commodity Futures Trading Commission:
Filings, transactions, and other data
Market Report Data

Consumer Financial Protection Bureau:
Credit Card Agreement Database
Consumer Complaint Database

Consumer Product Safety Commission:
Injury Statistics

Department of Education:
Civil Rights Data for Public Schools
EDFacts Data for K-12 Educational Programs
National Center for Education Statistics: Common Core of Data on Public School
Federal Student Aid Data
National Reporting System Data for Adult Education
Nation’s Report Card System Data

Department of Energy:
Energy Information Administration (EIA): Energy Prices Data
EIA: Renewable Energy Market Data
EIA: Crude Oil Production and Stocks Data

Environmental Protection Agency:
Air Quality Data
Enforcement Dockets data
National Pollutant Discharge Elimination System (NPDES) permits and compliance data
Toxic Substances Control Act Chemical Substance Inventory
Superfund Sites (CERCLIS database)

Equal Employment Opportunity Commission:
Enforcement and Litigation Statistics on Employment Discrimination

Federal Court System:
Bankruptcy Statistics

Federal Deposit Insurance Corporation:
Industry Data
Failed Bank Data

Federal Emergency Management Agency:
Assistance Record Data

Federal Financial Institutions Examination Council:
Financial and Structural Data for FDIC-insured Institution
Home mortgage loans data
Reinvestment Act Data

The Federal Reserve:
Consumer Credit data
Finance Companies Data
Foreign exchange rates
Government Receipts for Expenditures and Investments
Money Stock Measures
Treasury Account Series data

Federal Trade Commission:
Fraud and Identity Theft aggregates (Consumer Sentinel Network)

Fish and Wildlife Service:
Wetlands Data

General Services Administration:
Federal Procurement Report Data
FFATA Sub-award Reporting System (Data Reporting)
Small Business Goaling Report

Department of Health and Human Services:
Agency for Toxic Substances and Disease Registry (ATSDR): Environmental Health Webmap Data
ATSDR: Hazardous Substances Emergency Events Surveillance Report Data
ATSDR: National Toxic Substance Incidents Program Data
Centers for Disease Control and Prevention (CDC): Community Water Fluoridation Statistics
CDC: National Program of Cancer Registries Data
CDC: Surveillance Data
Centers for Medicare and Medicaid Services (CMS): Medicare Claims Data or Microdata
CMS: National Health Expenditures Data
CMS: Provider of Service Data
National Directory of New Hires Data
National Center for Health Statistics: Vital Statistics: Births, Deaths, Marriages, Divorces
Temporary Assistance to Needy Families Administrative Records

Department of Homeland Security:
Immigration Statistics

Department of Housing and Urban Development:
Community Development Block Grants Expenditures Data
Family Data on Public and Indian Housing and Microdata
Fair Market Rents Data
Government Sponsored Enterprise Data
Metropolitan Area Quarterly Residential and Business Vacancy Report Data
National Low Income Housing Tax Credit Database
Neighborhood Stabilization Program Data
Program Income Limits Data

Department of Interior:
US Geological Survey (USGS): Biodiversity, Species data
USGS: Land Cover and Land Use data
USGS: Water Resources data
USGS: Water Quality Data

International Trade Commission:
Tariffs Databases

Department of Justice:
Bureau of Prisons: Inmate, Population, and Staff Statistics
Bureau of Justice Statistics (BJS): Court Statistics Project Data
BJS: Federal Justice Statistics Program Data
BJS: Law Enforcement Management and Administrative Statistics
BJS: National Corrections Reporting Program Data
BJS: National Incident-Based Reporting System Data
BJS: National Prisoner Statistics Program Data
Federal Bureau of Investigation: Uniform Crime Reports Data

Department of Labor:
Bureau of Labor Statistics: Quarterly Census of Employment and Wages
Foreign Labor Certification Office: H-1B Data
Labor Retirement and Welfare Benefit Plan Data Set (Form 5500)
Occupational Safety and Health Administration (OSHA): Work-Related Injury or Illness Data
OSHA: Enforcement Data (Inspection Data)
OSHA: Worker Fatalities/Catastrophes Report (FAT/CAT) 

National Aeronautics and Space Administration:
Urban Landsat

Patent and Trademark Office:
U.S. Patent and Trademark Office patent data

Department of Transportation:
Bureau of Transportation Statistics (BTS): Air Carrier Statistics
BTS: Intermodal Passenger Connectivity Database
Maritime Administration: Maritime Travel and Transportation Statistics

Department of Treasury:
Bureau of Fiscal Service: Public Debt Report
Financial Crimes Enforcement Network: Mortgage and Real Estate Fraud Data Set
Interest Rate Statistics
Internal Revenue Service (IRS): Corporate Tax Statistics (Form 1120)
IRS: Employee Benefit Plans (Form 5500)
IRS: Individual Tax Statistics (Form 1040)
IRS: Quarterly Payroll Taxes (Form 941)

Securities and Exchange Commission:
Filings
Mutual Fund Fees and Expenses
Program and Market Data
Short Sale Volume Data 

Small Business Administration:
Small Business Lender and Loan Data

Social Security Administration:
Social Security Programs Data
Earnings and Employment Data for Workers Covered under Social Security and Medicare

Department of Veterans Affairs:
Veterans Benefits Administration Reports

Websites for Agency Procedures on Access to Restricted-Use Administrative Data Sets:

Bureau of Labor Statistics Confidential Data Sets Access
Census Bureau Restricted Data Sets Access
Agency for Healthcare Research and Quality Restricted Use Data Access
National Center for Health Statistics Restricted Use Data Access
National Center for Education Statistics Restricted Use Data Licenses
Bureau of Transportation Statistics Restricted-Release Airline Data Access
USDA’s Economic Research Service Agriculture Resource Management Survey Data Access
National Institute on Aging Restricted Data Access
Centers for Medicare and Medicaid Services Limited Data Access
Social Security Administration Health and Retirement Study Data Access
National Science Foundation/National Center for Science and Engineering Statistics Restricted-Use Data Access
Substance Abuse and Mental Health Data Archive

BU well represented at ASSA meetings in 2015

As would be expected since the Allied Social Science Associations (ASSA) meetings are in Boston this January, BU is well represented on the ASSA program. After searching and scanning through the program for current and former students and current faculty, I identified 81 BU affiliate names on the program, whether as authors, discussants or presiding. This includes 17 of our regular BU economics faculty. The full list with affiliations is shown below. Note that this reflects not only economics department members, but also SMG, SPH, Political Science, Law or whatever may be the current affiliation.

In 2014, when the meetings were in Philadelphia, there were 70 BU affiliates participating.
This count is almost certainly an undercount, since recognizing the names of BU alumni is imprecise. I apologize for missing some names.

If one restricts the count to only names of current BU affiliates then there are 57 names affiliated with BU, which is ahead of BC (32) and Brown (21) but well behind our neighbors of Harvard (270) and MIT (152). We seem to rank about 15th. We still have a ways to go!

As usual (always?) there will be a BU reception at the meetings. This year it will be Sunday, January 4, 6-8 p.m. in the Westin Hotel. Look in the program for the exact room.

It is not too early to plan on submitting for the next ASSA meetings:

January 3-5, 2016 (Sunday, Monday & Tuesday) San Francisco, CA Hilton San Francisco

Preliminary Program for 2015 is linked here.

https://www.aeaweb.org/Annual_Meeting/index.php

BU affiliates, with duplicate names signifying each time a name appears on the program:

Ahmed Galal      Economic Research Forum and former Finance Minister of Egypt
Alfredo Burlando             University of Oregon
Alisdair McKay   Boston University
Andrew F. Newman       Boston University and CEPR
Andrew F. Newman       Boston University and CEPR
Angela Dills         Providence College
Angela Dills         Providence College
Angela Dills         Providence College
Austin Frakt        Boston University
Berardino Palazzo            Boston University
Berardino Palazzo            Boston University
Berardino Palazzo            Boston University
Carola Frydman                Boston University
Carola Frydman                Northwestern University
Cathie Jo Martin               Boston University
Ching-to Albert Ma         Boston University
Claudia Olivetti  Boston University
Claudia Olivetti Boston University
Daniele Paserman           Boston University
Dara Lee Luca    Harvard University and University of Missouri
Dara Lee Luca    University of Missouri and Harvard University
Dirk Hackbarth Boston University
Evgeny Lyandres              Boston University
Francesco Decarolis        Boston University
Giorgos Zervas Boston University
Giulia La Mattina              University of South Florida
Gustavo Schwenkler      Boston University
Hiroaki Kaido      Boston University
Ivan Fernandez-Val         Boston University
Jae W. Sim          Federal Reserve Board
James Rebitzer                 Boston University
Jerome Detemple           Boston University
Jianjun Miao      Boston University
Jing Guo               American Institutes for Research
Julie Shi                Harvard University
Julie Shi                Harvard University
Julie Shi                Harvard University
Kathleen Carey                 Boston University
Kathleen Carey                 Boston University
Kehinde Ajayi    Boston University
Kehinde Ajayi    Boston University
Kehinde Ajayi    Boston University
Kehinde Ajayi    Boston University
Keith Marzilli Ericson       Boston University
Keith Marzilli Ericson       Boston University
Kevin Gallagher                Boston University
Kevin Gallagher                Boston University
Kevin Lang          Boston University
Kevin Lang          Boston University
Koichiro Ito         Boston University
Koichiro Ito         Boston University
Koichiro Ito         Boston University
Kristopher Gerardi          Federal Reserve Bank of Atlanta
Kristopher Gerardi          Federal Reserve Bank of Atlanta
Leslie Boden      Boston University
Marc Rysman     Boston University
Marc Rysman     Boston University
Marc Rysman     Boston University
Marcel Rindisbacher       Boston University
Martha Starr      American University
Megan MacGarvie           Boston University
Pasquale Schiraldi            London School of Economics
Pasquale Schiraldi            London School of Economics
Phillip H. Ross    Boston University
Randall Ellis         Boston University
Robert Margo    Boston University
Rodolfo Prieto   Boston University
Rui Albuquerque              Boston University
Samuel Bazzi      Boston University
Sean Horan         Université de Montréal
Shinsuke Tanaka              Tufts University
Shinsuke Tanaka              Tufts University
Shulamit Kahn   Boston University
Silvia Prina           Case Western Reserve University
Simon Gilchrist Boston University
Simon Gilchrist Boston University
Stefania Garetto              Boston University
Timothy Layton                 Boston University
Yorghos Tripodis               Boston University
Yuan Tian             Boston University
Yuping Tsai          Centers for Disease Control and Prevention

 

Employer Sponsored Insurance Also Surged in MA in 2007.

There has been a great deal of surprise expressed in the media over RAND’s latest report suggesting that more people have become insured through employer sponsored insurance (ESI) than through either Medicaid or the Exchanges under the ACA. One example is Adrianna McIntyre on The Incidental Economist, who posted on Wednesday:

“I can’t overstate how stunning this finding is if it’s true; CBO expected that ESI gains and losses would pretty much break even in 2014 and that employer coverage would decline modestly in future years (p. 108).”

This result is precisely NOT stunning if you study the Massachusetts health reform.
In Massachusetts the expansion in ESI coverage ALSO led the total increase during the first year and a half. Below is a table summarizing the early returns in MA from a Massachusetts Division of Health Care Finance and Policy study in 2011.

http://www.mass.gov/chia/docs/r/pubs/11/2011-key-indicators-may.pdf

Notice how growth in ESI dominated both Medicaid and the Exchange in the first two years, before being surpassed by these other two.

I speculate that part of the reason so many Massachusetts employers dropped their plans in 2010 was that they knew they were not compliant with the ACA’s new higher standard, but that is speculation. There was also a serious recession that affected employment and enrollment.

Massachusetts Health Reform http://www.mass.gov/chia/docs/r/pubs/11/2011-key-indicators-may.pdf
Insured Population by Insurance Types, 2006-2010
Excluding Medicare
June 30 2006 Dec 31 2006 Dec 31 2007 Dec 31 2008 Dec 31 2009 Dec 31 2010
Private Group 4,333,014 4,395,136 4,457,157 4,474,466 4,358,867 4,315,040
Individual Purchase 40,184 38,718 65,465 81,073 114,668 117,514
MassHealth 705,179 740,663 764,559 780,727 848,528 898,572
Commonwealth Care 0 18,327 158,194 162,725 150,998 158,973
Total Members 5,078,377 5,192,814 5,445,375 5,498,991 5,473,061 5,490,099
Change since 6/30/2006 Dec 31 2006 Dec 31 2007 Dec 31 2008 Dec 31 2009 Dec 31 2010
Private Group 62,122 124,143 141,452 25,853 -17,974
Individual Purchase -1,466 25,281 40,889 74,484 77,330
MassHealth 35,484 59,380 75,548 143,349 193,393
Commonwealth Care 18,327 158,194 162,725 150,998 158,973
Total Members 114,437 366,998 420,614 394,684 411,722
Distribution of new enrollment as a fraction of total gains Dec 31 2006 Dec 31 2007 Dec 31 2008 Dec 31 2009 Dec 31 2010
Private Group 54% 34% 34% 7% -4%
Individual Purchase -1% 7% 10% 19% 19%
MassHealth 31% 16% 18% 36% 47%
Commonwealth Care 16% 43% 39% 38% 39%
Total Members 100% 100% 100% 100% 100%