Some of my earliest memories as a child revolve around going to the public library in my hometown. Born into a family of readers, I remember going every week to pick up a new stack of as many picture books as a library card would allow and tearing through them, sometimes alone, more often not. Sunday afternoons would often find my parents, my brother, and me sitting together in the living room, each with a book in hand.
The impact this had on me as a child is hard to quantify, but I strongly believe that this weekly (sometimes more frequent) pilgrimage to the library had lasting effects on my education and led to a lifelong love of reading.
Fast forward 12 years. As a freshman in college, I discovered another passion: data science. Typing strange words into a computer to reveal hidden patterns seemed to me not a contradictory hobby to reading and research, but a complementary one. Instead of synthesizing information from a book, I was synthesizing graphs and linear regressions. While the world sees these as opposite ends of some 'right brain/left brain' spectrum, I always found the methodology more similar than different.
As I progressed in my data science journey, I often heard one thing repeated over and over again:
"You can't just say you can do data science, you need to be able to prove it."¶
As I considered this, I tried to find something I was interested in to work on. I don't find sports very interesting and, given the number of politics-centered thinkpieces, found that realm overdone. So I kept coming back to the thing that has remained true since I was a child: the importance of public libraries.
"Libraries are a cornerstone of democracy—where information is free and equally available to everyone. People tend to take that for granted, and they don’t realize what is at stake when that is put at risk." —Carla Hayden¶
This became my newfound goal, the beginnings of which are realized here: making the case for libraries not just with a public-service argument but with the most compelling one I can offer, a data-centric one. The journey that follows focuses on exploring the national library system and what it means to quantify a library's impact.
The dataset I focus on here is a 2016 survey of US public libraries conducted by the Institute of Museum and Library Services. The data is extensive, broken into three files, two of which I use in this analysis. The first is broken down by state: 53 entries covering the 50 states, the District of Columbia, American Samoa, and Guam (the outlying areas of the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands did not participate). The second file is even more detailed, with an entry for each main library. Of the 9,234 libraries surveyed, 9,024 responded, a response rate of 97.7 percent.
Those interested in accessing this data can find it here.
As this analysis focuses on the 50 US states, I removed the data from DC, American Samoa and Guam.
I used a host of libraries throughout the exploration, beginning with:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import folium
import os
Then, a high-level overview of the data:
state_df = pd.read_csv('PLS_FY2016_State_pusum16a.csv')
# Drop the non-state rows (District of Columbia, American Samoa, and Guam)
state_df = state_df.drop([3, 8, 12])
state_df.head()
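Dropping by positional index works here, but it is fragile if the file order ever changes. A more explicit alternative, a minimal sketch assuming the summary file's STABR column uses the standard two-letter postal abbreviations, would be to filter by state code:
# Keep only the 50 states by excluding the non-state postal codes
non_states = ['DC', 'AS', 'GU']
state_df = state_df[~state_df['STABR'].isin(non_states)].reset_index(drop=True)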
There are 114 columns listed in the data, which we will define as we use them.
The first aspect of the data I was curious about was the number of visits at each state's libraries.
state_df.nlargest(5, 'VISITS')
state_geo = os.path.join('', 'us-states.json')
m = folium.Map(location=[37, -102], zoom_start=4)
m.choropleth(
geo_data=state_geo,
name='choropleth1',
data=state_df,
columns=['STABR', 'VISITS'],
key_on='feature.id',
fill_color='BuGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Number of Visits'
)
folium.LayerControl().add_to(m)
m
Of course this data is influenced by the population of each state, but a few results stuck out immediately: New York and Ohio rank in the top five, above Florida and Texas, both of which have much larger populations. To mitigate this influence, I graphed visits per 100 residents of each state's "LSA," or legal service area.
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
mapByPop = {"STABR": state_df["STABR"], "Per100": (state_df['VISITS'] / state_df['POPU_LSA']) *100}
mapByPop = pd.DataFrame(data=mapByPop)
mapByPop.nlargest(10, "Per100")
m = folium.Map(location=[37, -102], zoom_start=4)
m.choropleth(
geo_data=state_geo,
name='choropleth',
data=mapByPop,
columns=['STABR', 'Per100'],
key_on='feature.id',
fill_color='BuGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Visits per 100 residents'
)
folium.LayerControl().add_to(m)
m
As you can see, the number of visits per 100 residents varies wildly by state. One potential factor influencing this is how far people live and work from their closest library. However, even in more stereotypically rural states like Wyoming, visitation is strong.
This became the point of interest in my exploration: figuring out what was driving this variability in the number of visits per 100 people. I assumed that most of these visits were being done by a section of the population who visited frequently. Thus, the first question I had was whether there were simply more registered users of the libraries in these states.
mapByReg = {"STABR": state_df["STABR"], "Per100": (state_df["REGBOR"] / state_df['POPU_LSA']) *100}
mapByReg = pd.DataFrame(data=mapByReg)
mapByReg.nlargest(10, "Per100")
m = folium.Map(location=[37, -102], zoom_start=4)
m.choropleth(
geo_data=state_geo,
name='choropleth',
data=mapByReg,
columns=['STABR', 'Per100'],
key_on='feature.id',
fill_color='BuGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Registered Users per 100 residents'
)
folium.LayerControl().add_to(m)
m
sns.set(style="darkgrid")
g = sns.jointplot(mapByPop["Per100"],mapByReg["Per100"], kind="reg",
xlim=(0, 1000), ylim=(0, 100), color="m", height=7)
g.set_axis_labels('Visits per 100 Residents', 'Registered Users per 100 Residents', fontsize=16)
The percentage of registered users varies widely by state, from just under 30% to just over 75%. How well can this explain the differences in the number of visits?
import statsmodels.api as sm
x = mapByReg["Per100"]
y = mapByPop["Per100"]
# Note the difference in argument order
model = sm.OLS(y, x).fit()
# Print out the statistics
model.summary()
The R-squared from the least squares regression, calculated here, is just under 95%. This suggests that roughly 95% of the variability in the number of visits can be explained by the number of registered users (with the caveat that this model is fit without an intercept, so the reported R-squared is the uncentered version and tends to run high).
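As a sanity check, one could refit the same relationship with an intercept term; sm.OLS does not add one automatically, so sm.add_constant is needed. This is a minimal sketch reusing the mapByReg and mapByPop frames built above, and the centered R-squared it reports will generally be lower than the no-intercept value quoted here:
import statsmodels.api as sm
# Refit with an explicit intercept; add_constant prepends a column of ones
X = sm.add_constant(mapByReg["Per100"])
y = mapByPop["Per100"]
model_with_intercept = sm.OLS(y, X).fit()
model_with_intercept.summary()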
This seems intuitive: the more registered users you have, the more visits you get. But what's driving the number of registered users? One of the first things I thought of was the possibility that some libraries are keeping up with technology better than others. (I personally began using the library's services at a higher rate after buying a Kindle; being able to download ebooks from the library eliminated the hassle of going in person for every book.)
First, let's find the portion of circulated material that's generally categorized as 'electronic':
d = {'STABR': state_df['STABR'], 'BKPER': state_df['BKVOL']/state_df['TOTCIR'], 'EBOOKPER': state_df['EBOOK']/state_df['TOTCIR'], 'AUDIOPER': (state_df['AUDIO_PH']+state_df['AUDIO_DL'])/state_df['TOTCIR'], 'VIDEOPER': (state_df['VIDEO_PH']+state_df['VIDEO_DL'])/state_df['TOTCIR']}
circPercent = pd.DataFrame(data=d)
# Remaining share of circulation not captured by the four media columns
circPercent['Other'] = 1 - circPercent[['BKPER', 'EBOOKPER', 'AUDIOPER', 'VIDEOPER']].sum(axis=1)
circPercent.head()
d = {'STABR': state_df['STABR'], 'KIDCIRCL_PER': state_df['KIDCIRCL']/state_df['TOTCIR'], 'ELMATCIR_PER': state_df['ELMATCIR']/state_df['TOTCIR'], 'PHYSCIR_PER': state_df['PHYSCIR']/state_df['TOTCIR']}
physPercent = pd.DataFrame(data=d)
physPercent.head()
chart = sns.barplot(x="STABR", y="ELMATCIR_PER", data=physPercent)
sns.set(rc={'figure.figsize':(15.7,5)})
chart.set_title('Percent of Material that is Electronic')
chart.set_ylabel('Percent Electronic')
chart.set_xlabel("State")
chart
Now that we can see what share of circulation is electronic, what's the specific breakdown of the circulated material?
x = circPercent['STABR']
y1 = circPercent['BKPER']
y2 = circPercent['EBOOKPER']
y3 = circPercent['AUDIOPER']
y4 = circPercent['VIDEOPER']
y5 = circPercent['Other']
# total of all shares, used to normalize each state's bars to 100%
snum = y1+y2+y3+y4+y5
# normalization
y1 = y1/snum*100.
y2 = y2/snum*100.
y3 = y3/snum*100.
y4 = y4/snum*100.
y5 = y5/snum*100.
plt.figure(figsize=(4,3))
# stack bars
plt.bar(x, y1, label='Books (%)')
plt.bar(x, y2, bottom=y1, label='eBooks (%)')
plt.bar(x, y3, bottom=y1+y2, label='Audio Books (%)')
plt.bar(x, y4, bottom=y1+y2+y3, label='Videos (%)')
plt.bar(x, y5, bottom=y1+y2+y3+y4, label='Unidentified (%)')
plt.ylim(0,100)
plt.legend(bbox_to_anchor=(1.01,0.5), loc='center left')
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
This graph shows the rough distribution of media types but suffers from a large "unidentified" category, as many of the libraries surveyed did not provide a breakdown of all items in circulation. We can remove this category to get a better look at the identified material, but the resulting graph must be treated with caution:
x = circPercent['STABR']
y1 = circPercent['BKPER']
y2 = circPercent['EBOOKPER']
y3 = circPercent['AUDIOPER']
y4 = circPercent['VIDEOPER']
snum = y1+y2+y3+y4
# normalization
y1 = y1/snum*100.
y2 = y2/snum*100.
y3 = y3/snum*100.
y4 = y4/snum*100.
plt.figure(figsize=(4,3))
# stack bars
plt.bar(x, y1, label='Books (%)')
plt.bar(x, y2, bottom=y1, label='eBooks (%)')
plt.bar(x, y3, bottom=y1+y2, label='Audio Books (%)')
plt.bar(x, y4, bottom=y1+y2+y3, label='Videos (%)')
plt.ylim(0,100)
plt.legend(bbox_to_anchor=(1.01,0.5), loc='center left')
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
plt.show()
d = {'STABR': state_df['STABR'], 'NonBookPercent': (1-(circPercent['BKPER']))*100, 'RegPer100': mapByReg["Per100"]}
RegUserWithBk = pd.DataFrame(data=d)
RegUserWithBk.head()
sns.set(style="darkgrid")
g = sns.jointplot(RegUserWithBk["RegPer100"],RegUserWithBk['NonBookPercent'], kind="reg",
xlim=(0, 100), ylim=(0, 100), color="m", height=7)
g.set_axis_labels('Registered Users per 100 Residents', 'Non-Book Share of Circulation (%)', fontsize=16)
What about the R-squared value?
import statsmodels.api as sm
x = RegUserWithBk["RegPer100"]
y = RegUserWithBk['NonBookPercent']
# Note the difference in argument order
model = sm.OLS(y, x).fit()
# Print out the statistics
model.summary()
0.928. That means 92.8% of the variation in the non-book share of circulation can be explained by the number of registered users per 100 residents. Again, we need to take this with a grain of salt, as the response rate for this question was not great.
The next factor I wanted to consider was internet accessibility, often touted as the 'future' of public libraries. Public internet access gives residents a way to get online for self-improvement, job hunting, or homework after school.
# PITUSR / POPU_LSA gives public internet computer uses per resident (not per 100, despite the column name)
compUse = {'SessionsPer100': state_df["PITUSR"]/state_df["POPU_LSA"]}
compUse = pd.DataFrame(data=compUse)
compUse['STABR'] = state_df["STABR"]
compUse.head()
sns.set(style="darkgrid")
comp_chart = sns.barplot(x=compUse["STABR"], y=compUse['SessionsPer100'])
sns.set(rc={'figure.figsize':(15.7,5)})
comp_chart.set_title("Computer Uses")
comp_chart.set(xlabel='State', ylabel='Public Internet Computer Uses Per Resident Per Year')
plt.show()
m = folium.Map(location=[37, -102], zoom_start=4)
m.choropleth(
geo_data=state_geo,
name='choropleth',
data=compUse,
columns=['STABR', 'SessionsPer100'],
key_on='feature.id',
fill_color='BuGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Computer Sessions Per Resident'
)
folium.LayerControl().add_to(m)
m
x = mapByReg["Per100"]
y = compUse['SessionsPer100']
# Note the difference in argument order
model = sm.OLS(y, x).fit()
# Print out the statistics
model.summary()
This yields one of the strongest relationships so far.
The last factor I wanted to look at was the one I suspected would have the greatest impact: funding. The data breaks funding out by local, state, and federal sources.
funding = pd.concat([state_df['STABR'], state_df['FEDGVT'], state_df['STGVT'], state_df['LOCGVT'], (state_df['FEDGVT'] + state_df['LOCGVT'] + state_df['STGVT'])], axis=1)
funding.columns = ['STABR', 'FEDGVT', 'STGVT', 'LOCGVT', 'TOTAL']
funding['fund_per_person'] = funding['TOTAL']/ state_df['POPU_LSA']
funding.head()
sns.set(style="darkgrid")
funding_per_per = sns.barplot(x="STABR", y="fund_per_person", data=funding)
sns.set(rc={'figure.figsize':(15.7,5)})
funding_per_per.set_title("Funding")
funding_per_per.set(xlabel='State', ylabel='Funding Per Person')
plt.show()
m = folium.Map(location=[37, -102], zoom_start=4)
m.choropleth(
geo_data=state_geo,
name='choropleth',
data=funding,
columns=['STABR', 'fund_per_person'],
key_on='feature.id',
fill_color='BuGn',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Funding Per Resident'
)
folium.LayerControl().add_to(m)
m
x = funding['STABR']
y1 = funding['FEDGVT']
y2 = funding['STGVT']
y3 = funding['LOCGVT']
snum = y1+y2+y3
# normalization
y1 = y1/snum*100.
y2 = y2/snum*100.
y3 = y3/snum*100.
plt.figure(figsize=(4,3))
# stack bars
plt.bar(x, y3, label='Local Funding (%)')
plt.bar(x, y2, bottom=y3, label='State Funding (%)')
plt.bar(x, y1, bottom=y3+y2, label='Federal Funding (%)')
plt.ylim(0,100)
plt.legend(bbox_to_anchor=(1.01,0.5), loc='center left')
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
This graph surprised me at first but made sense after some thought. The vast majority of a library's funding comes from local government: the more a city invests in its library, the more funding it has. Hawaii breaks this rule entirely, using no local funding at all and relying instead on state dollars. Many libraries are also funded for specific initiatives rather than receiving blanket funding.
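To confirm the Hawaii pattern, one can pull its row out of the funding frame built above; a quick check:
# Hawaii's funding mix: the local share should be essentially zero
funding[funding['STABR'] == 'HI']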
sns.set(style="darkgrid")
g = sns.jointplot(mapByReg["Per100"], funding['fund_per_person'], kind="reg",
xlim=(0, 100), ylim=(0, 100), color="m", height=7)
g.set_axis_labels('Registered Users per 100 Residents', 'Funding per Person', fontsize=16)
x = mapByReg["Per100"]
y = funding['fund_per_person']
# Note the difference in argument order
model = sm.OLS(y, x).fit()
# Print out the statistics
model.summary()
The R-squared for this regression is 0.887, meaning 88.7% of the variability is explained. That's not as strong as the others. This surprised me, as I was expecting funding to be connected with higher registration. Out of curiosity, I plotted funding against the total number of visits per 100 residents.
sns.set(style="darkgrid")
g = sns.jointplot(mapByPop["Per100"], funding['fund_per_person'], kind="reg",
xlim=(0, 1000), ylim=(0, 100), color="m", height=7)
g.set_axis_labels('Visits per 100 people', 'Funding per Person', fontsize=16)
x = mapByPop["Per100"]
y = funding['fund_per_person']
# Note the difference in argument order
model = sm.OLS(y, x).fit()
# Print out the statistics
model.summary()
Wow: 0.952. I wonder why funding accounts for so much more of the variability in attendance than in registration.
My best guess is that well-funded libraries bring guests back more frequently but don't change the share of residents who come at all. Interesting.
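To line these relationships up side by side, one could compute the same no-intercept R-squared for each candidate factor in a loop. This is a minimal sketch reusing the per-state frames built above; note that it standardizes on visits per 100 residents as the response in every case, whereas the individual regressions in this post vary in which variable they treat as the response:
import statsmodels.api as sm
# Candidate predictors, all built earlier in this analysis
factors = {
    'Registered users per 100': mapByReg["Per100"],
    'Computer sessions per resident': compUse['SessionsPer100'],
    'Funding per person': funding['fund_per_person'],
}
visits = mapByPop["Per100"]
for name, series in factors.items():
    r2 = sm.OLS(visits, series).fit().rsquared
    print(f"{name}: R-squared = {r2:.3f}")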
It is also worth noting that library funding is directly tied to the tax revenue raised by municipalities. Wealthier areas therefore almost always have better-funded libraries. This can be problematic, as the populations most in need of libraries often don't have fully funded ones.
Since so much of the funding for libraries comes locally, this is where I began using the second file, which breaks the information down by individual library. In order to increase processing speed, I pared this file down to just the columns I'm interested in (a sketch of that step follows). Let's see how much impact funding has per library.
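The paring step itself isn't shown in the notebook; a minimal sketch might look like the following, where 'PLS_FY2016_Library.csv' is a placeholder name for the full library-level export (use whatever the IMLS download is called on your machine), and the kept columns are exactly the ones used below:
# Keep only the columns used below and write a smaller file for faster loading
cols = ['STABR', 'CITY', 'VISITS', 'POPU_LSA', 'LOCGVT', 'STGVT', 'FEDGVT']
full_df = pd.read_csv('PLS_FY2016_Library.csv', usecols=cols)
full_df.to_csv('smaller2016.csv', index=False)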
indiv_df = pd.read_csv('smaller2016.csv')
indiv_df.head()
each_lib = {'State': indiv_df['STABR'], 'City': indiv_df['CITY'], 'Per100': (indiv_df['VISITS']/indiv_df['POPU_LSA'])*100, 'totalFunding': indiv_df['LOCGVT']+indiv_df['STGVT']+indiv_df['FEDGVT']}
each_lib = pd.DataFrame(data=each_lib)
each_lib['fund_per_person'] = each_lib['totalFunding']/indiv_df['POPU_LSA']
each_lib.head()
sns.set(style="darkgrid")
g = sns.jointplot(each_lib["Per100"], each_lib['fund_per_person'], kind="scatter",
xlim=(0, 10000), ylim=(0, 1000), color="m", height=7)
g.set_axis_labels('Visits per 100 people', 'Funding per Person', fontsize=16)
x = each_lib["Per100"]
y = each_lib['fund_per_person']
# Note the difference in argument order
model = sm.OLS(y, x).fit()
# Print out the statistics
model.summary()
Not as strong as when we were looking at entire states. Curious about those outliers, I pulled the highest-funded and most heavily visited libraries:
each_lib.nlargest(10, "Per100")
each_lib.nlargest(10, "fund_per_person")
Unless the city of Pontiac, MI really cares about its public libraries, I'm guessing there are some flaws in our data. Without more information I can't decide what to do about this, so we will just need to take these results with a grain of salt.
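To look at the suspicious rows directly, one could filter the library-level frame by city; a quick sanity check (assuming city names appear in the CITY column roughly as expected):
# Inspect the outlier rows flagged above
indiv_df[indiv_df['CITY'].str.upper() == 'PONTIAC']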
So what does all of this mean?
In the analysis above we can see that the factors most connected with library visitation are funding per person and computer usage. Thus, my recommendation for increasing library attendance would be to lobby for higher funding (surprise) and to direct more of that funding toward computer and wifi access.
Unfortunately, the level of impact libraries have on people's lives is much harder to measure, especially with the data provided. One other major aspect that could drive attendance and engagement, programming for teens and children, was not explored here. We also cannot account for the errors in our dataset.
In the future I hope to continue this work, in the hope of giving other children the head start that my local library gave me.