Monday, May 28, 2012

Zoos and Endangered Species -- trade-offs in everything

Take a look at this article by Leslie Kaufman for the New York Times, "Zoos’ Bitter Choice: To Save Some Species, Letting Others Die." An excerpt:
As the number of species at risk of extinction soars, zoos are increasingly being called upon to rescue and sustain animals, and not just for marquee breeds like pandas and rhinos but also for all manner of mammals, frogs, birds and insects whose populations are suddenly crashing.
To conserve animals effectively, however, zoo officials have concluded that they must winnow species in their care and devote more resources to a chosen few. The result is that zookeepers, usually animal lovers to the core, are increasingly being pressed into making cold calculations about which animals are the most crucial to save. Some days, the burden feels less like Noah building an ark and more like Schindler making a list.
A core dilemma is whether zoos should be more focused on entertainment, which people are more willing to pay for, or on preserving biological diversity for the "public good."

In some cases, it seems like advocates in the latter camp are more concerned with shaping public preferences than responding to them. The article continues:
Zoos are essentially given a menu of endangered species that the association is trying to maintain and can then choose according to their particular needs. But final decisions are often as much about heart as logic.
St. Louis, for example, has committed $20 million — or the equivalent of 40 percent of its annual operating budget — to building an enormous exhibit for polar bears — complete with a fake ice floe — even though its last polar bear died in 2009 and the Marine Mammal Protection Act makes it illegal to remove or rescue the bears from the wild. The zoo hopes that in the five years needed to open the exhibit, it can argue for an exemption, import orphaned bears from Canada or perhaps secure the cubs of captive bears.
Dr. Bonner acknowledges that the polar bear project runs counter to many of his more practical convictions on the role of the modern zoo. He has insisted that his keepers spend what limited field conservation dollars they raise on threatened animals that are most likely to make a comeback in the wild. With sea ice disappearing at an alarming rate, polar bears do not fit the profile.
But he justifies the exemption as a lesson for zoo visitors: “I want people to see this beautiful creature and ask, ‘How could we have let this happen?’ ”
Personally, if I went to a public zoo and saw nearly half of its operating budget spent on an object lesson in how difficult it is to preserve polar bears, collective guilt over habitat loss would not be the first thought to cross my mind.

There is a valid argument to be made for preserving biological diversity. Fear of a catastrophic breakdown, expressed through a variety of vivid analogies, is one of the more popular arguments, although perhaps one of the less valid. This "invisible threshold" argument has become a rationale for preserving species within even the most marginal ecological niches.
Several large buckets of dirt are now home to the threatened American burying beetle, so named because it buries the corpses of small animals, like birds and squirrels, and lays its eggs around them. Once, the beetles, with their brilliant red markings, ranged over 35 states. By the time the United States Fish and Wildlife Service listed them as endangered in 1989, there was one known population left, in Rhode Island.
At the government’s behest, the St. Louis Zoo, in conjunction with a zoo in Rhode Island, has been successfully breeding them and returning them to the wild.
Mr. Merz says the effort was worthwhile because the beetle might play an irreplaceable role in the ecological web. He considers picking species worth saving akin to life-or-death gambling. “It is like looking out the window of an airplane and seeing the rivets in the wing,” he said. “You can probably lose a few, but you don’t know how many, and you really don’t want to find out.”
One has to wonder, if burying beetles and partula snails are so crucial to the ecosystem, why are they only surviving in the back closet of a zoo? The burying beetle has been functionally absent from the North American ecosystem for over twenty years. Why haven't we seen any consequences yet?

Of course maybe this is just one more "rivet in the airplane's wing", the loss of which pushes us imperceptibly closer to global disaster. But, when you have to compare the costs of saving potentially millions of different endangered species, it helps to have an idea of the probabilities, rather than saying they are all equally unknown and potentially deadly.

What is the chance that any one particular species is completely irreplaceable? It is incumbent on the defenders of biodiversity to make these estimations, instead of demanding that every species must be saved, regardless of the cost. Even zoo-keepers can't live up to such an unattainable goal.

Friday, May 25, 2012

Twitter List Networks, Part 2: Spammers

As a point of comparison with my last post, here's another network of Twitter lists.

Instead of using my main account, this was built from a separate "follow-back" account I check on occasionally, which posts no genuine content whatsoever. All the accounts listing it are themselves follow-back bots or promotional accounts (lots of rappers and penny stock experts represented).

The "follow back" hairball.

I'm going to guess this is what happens when there are lots of accounts using some auto-listing utility. When compared to a network structure based on actual common interests instead of strategic Twitter-usage, the difference is striking.

Thursday, May 24, 2012

Are Twitter Lists another social network?

Twitter has a "List" feature, which allows users to organize their followers, or view tweets from only a select group. Until recently, the number of lists following each Twitter account was visible on that person's homepage, but that changed in the most recent overhaul of the Twitter interface.

Lists are now much less visible in the average Twitter user's experience. This leads me to wonder, do Twitter lists follow the pattern of other social networks? Did the change even make any difference? With NetworkX and some fiddling around on the Twitter API, I was able to answer that question.


To start, I made a queue of everyone who lists me on Twitter (77 in total). Then I had my script go to each of those accounts, collect all the lists they have, and add everyone they have in a list as an edge in my graph.
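The collection step above can be sketched roughly as follows. This is a minimal illustration, not the script actually used: it assumes the list memberships have already been downloaded into a plain dictionary (the hypothetical `lists_by_owner`), and it uses built-in sets rather than NetworkX so it runs without any dependencies. The account names are made up.

```python
from collections import defaultdict


def build_list_graph(lists_by_owner):
    """Build an undirected graph as {account: set(neighbors)}.

    lists_by_owner is a hypothetical mapping: account -> accounts it has
    placed in any of its lists (assumed to be fetched already).
    """
    graph = defaultdict(set)
    for owner, members in lists_by_owner.items():
        for member in members:
            if member != owner:  # ignore an account listing itself
                graph[owner].add(member)
                graph[member].add(owner)
    return dict(graph)


def count_edges(graph):
    """Each undirected edge is stored twice, once per endpoint."""
    return sum(len(neighbors) for neighbors in graph.values()) // 2


# Made-up example: two accounts that list "me", plus who else they list.
lists_by_owner = {
    "alice": ["me", "econ_blog", "news_site"],
    "bob": ["me", "econ_blog"],
}
graph = build_list_graph(lists_by_owner)
print(count_edges(graph), sorted(graph["me"]))
```

Storing each edge on both endpoints keeps neighbor lookups cheap, which matters once the graph grows to thousands of edges.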

After trimming out accounts which are only in a single list (taking down the number of edges from the thousands to a few hundred), the result was this:

Dots are Twitter accounts, lines show which accounts have each other in lists.

Not very edifying. As you may be able to see, there are clusters (one on the left, one on the right) where two accounts list lots of the same people but aren't strongly connected with anyone else. After a little investigation, I found these accounts had very similar names and were very likely being run by the same person/program.

I took those out, and after another round of trimming, got a network that looked like this:

Enhanced.

So, it looks like there are two separate clusters of nodes, with a few tenuous connections in between. What can we say about the different members of this network?

It's hard to display visually here, but I looked at the names of accounts in the graph above. A clear topical split became apparent. On the right are promotional accounts, which are densely clustered and list each other frequently (I'm guessing many of these are automated).

On the left side of the graph are economists, news outlets, and political figures. Some economists appearing are: Tyler Cowen, Dambisa Moyo, Paul Krugman, and Bill Easterly; news outlets: The Economist, Washington Post, The Nation, and NY Post Opinion; as well as accounts for politicians such as Paul Ryan, Chris Christie, and Ron Paul.

Toward the center of the graph are "neutral" news outlets like TechCrunch, and the left-leaning (Huffington Post, Barack Obama). They are connected to the right-wing tweeters through mainstream news outlets like Roll Call and Rasmussen Poll, but have almost no direct connections with those accounts.

Zooming in on the "news and economics" segment (click to enlarge).
Since these are drawn from who listed me, the selection here says more about my Twitter account than the site more generally. But, I still find it interesting because almost none of the news/politics accounts above were in the original sample of accounts listing me. I'd infer that the ones shown represent common interests of people likely to list me.
For the curious: the full version with names.
More generally, this shows that lists on Twitter are segregated heavily by interests. This isn't surprising, as that is the list function's intended purpose, and it seems people are using it toward that end. 

Whether the interest is news and politics, or just gaining more followers, the pattern of listings on Twitter does seem to reflect a spontaneously organizing social network.

Diablo III and the Newsvendor Model

How does a long-awaited sequel, which became the fastest selling PC game of all time, still end up with a 2-star rating on Amazon? Probably because so many people were excited to play it and then couldn't, due to Blizzard's "always online" anti-piracy strategy combined with shaky server support.

Diablo III has made tons of money, but still turned into a PR nightmare for its parent company. From an economic perspective, however, these two things are not necessarily in opposition.

The newsvendor (or 'newsboy') problem, popular in the operations management literature, gives some insight into this apparent contradiction. It models a retailer who doesn't know exactly how much demand there will be for his/her product in the next period, and has to decide on inventory levels now. The vendor knows quantity demanded will be pulled from some statistical distribution, and wants to maximize expected profits.

This situation isn't too different from a video game company trying to decide how much to invest in server capacity. Blizzard doesn't know exactly how many people will buy the game on its release date, although they probably have some estimate (based on pre-purchases or past sales totals for their games, for example). They ideally want to have just enough server capacity to let everyone play, and no more. Given uncertainty, however, that goal is hard to accomplish.

The newsvendor model would advise a firm to purchase the median quantity demanded (which equals the average when demand is symmetric), assuming the costs of over- and under-purchase are exactly equal. For Diablo III, costs aren't exactly equal: once someone has bought, they won't be able to return the game if servers are overloaded -- at worst, maybe they tell friends not to buy it. But, if Blizzard over-purchases in server capacity, they're stuck with those costs.

In this case, over-purchase costs are higher than under-purchase costs, so it's rational for Blizzard to buy less than the average expected demand for their server capacity... Much to the chagrin of their loyal fans.
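To make the logic concrete, here is a small sketch of the textbook newsvendor rule: stock up to the demand quantile given by the "critical ratio" of under-purchase cost to total cost. The normal demand distribution and all the numbers are assumptions for illustration, not anything known about Blizzard's actual costs or demand.

```python
from statistics import NormalDist


def newsvendor_quantity(mean, stdev, underage_cost, overage_cost):
    """Capacity that maximizes expected profit, assuming normal demand.

    The optimal service level is the critical ratio cu / (cu + co);
    the optimal quantity is that quantile of the demand distribution.
    """
    critical_ratio = underage_cost / (underage_cost + overage_cost)
    return NormalDist(mean, stdev).inv_cdf(critical_ratio)


# Hypothetical demand: mean 100, standard deviation 20 (arbitrary units).
equal_costs = newsvendor_quantity(100, 20, underage_cost=1, overage_cost=1)
costly_overage = newsvendor_quantity(100, 20, underage_cost=1, overage_cost=3)
print(round(equal_costs, 1), round(costly_overage, 1))
```

With equal costs the rule returns the middle of the distribution. When unused capacity costs three times as much as a turned-away player, the recommended capacity falls below average demand, which is the situation described above.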

Consumers have a right to be annoyed, but these opening-day server issues shouldn't be much of a surprise. Counter-intuitively, if everyone could play without any interruptions at all, that outcome would probably be even more inefficient, at least from Blizzard's perspective.

Wednesday, May 23, 2012

Don't interact with strangers' children.

The way current law is set up, being a Good Samaritan and trying to rescue someone else's kid can only get you in trouble.

Browsing the Internet, I've found a few anecdotes which support this view. I don't have any verification that they're true, so you'll just have to take my (and their) word for it.

First story: a young woman is waiting at a street corner. She sees a mother, who is not paying attention, whose child wanders out into the street in front of an oncoming bus. The young woman jumps out and pulls the child back onto the sidewalk. Her reward? The mother yells "how dare you touch my kid!!" and our would-be hero is treated as a villain and forced to flee the scene.

Second story: a young man is on the beach. He observes a small male child falling off his surf board a long distance from land. The young man swims out and rescues the child from drowning. On returning to shore, he's greeted by an irate mother who calls the police and wants to press charges for child molestation. Luckily, witnesses confirm the man's story and the cops let him go.

Following this second anecdote a (self-proclaimed) lawyer comments, describing how this situation could have led directly to the young man being registered as a sex offender. By the time police would have questioned the child, his head would be full of misinformation from the angry mom, causing him to tell the police what they "want to hear", possibly putting the Samaritan behind bars or at least requiring a costly and life-disruptive legal defense.

Now, I'm not blaming either the moms in this situation (they are probably freaked out and will naturally accuse the first person they see who might be responsible for their child's endangerment) or the harsh treatment of sex offenders (children should obviously be protected from predators). But it's worth noting the incentive effects that these sorts of stories have on potential Good Samaritans.

My personal stance is to never interact with a stranger's child no matter what the circumstances are. I won't engage in conversation, nod, smile, or hold a door open. I was about to say that the most proactive thing I'd do if I saw a child in danger would be to record the incident on video to give to the authorities later, but even taking pictures of kids can get a guy in trouble... So I probably wouldn't even do that.

Being a Good Samaritan is really a lose-lose proposition. If I succeeded in saving the child, best-case scenario I get a pat on the back, worst-case is a sex-crimes trial that will haunt me for the rest of my life. If I fail to save the child (it still falls under the bus) then maybe I get accused of murder or assault because the angry parent saw me "push" the kid instead of trying to rescue it!

There is absolutely no upside to helping or interacting with a stranger's kid. Perversely, this fact makes being a Good Samaritan far worse: because rational people know it's a bad idea to help a kid, the people who do try to help are even more likely to be creeps or labeled as such (the selection effect).

In its efforts to prevent strangers from harming vulnerable children, society has also unintentionally deterred strangers from assisting vulnerable children. It's hard to say which impact is more important, but given the relative magnitudes (there are lots more healthy, well-intentioned people out there than sex offenders) it's very possible the overall effect has been negative for child safety.

Tuesday, May 22, 2012

Promoted Accounts on Twitter, the Great Enigma

For a class project (CSS692/ECO895, Social Network Analysis) my group - Kevin May, Echo Keif and I - bit off almost more than we could chew: identifying astroturf on Twitter. It turned out to be more ambitious than we realized, but even starting with a low level of technical sophistication we were able to find some interesting results.

What is astroturf? While most social movements are said to have "grassroots" origins, sometimes wealthy organizations will attempt a "cashroots" strategy instead - paying for people to spread a pre-chosen message. This has been a problem since the dawn of democracy, but social media has given many more opportunities for astroturfing.

The Truthy Project is one attempt to track how online memes spread, and distinguish authentic movements from fabricated ones. However, there still isn't much agreement on what an astroturfer looks like, compared to a genuine grassroots movement.

We focused on Twitter for our project. The recently unveiled Promoted Accounts feature, used by Twitter to generate revenue, might uncharitably be described as a tool for astroturfing. Promoted Accounts are put at the top of the "Who To Follow" list shown to each Twitter user, but otherwise not tracked or recorded in a publicly accessible way. Our goal was to identify common characteristics of Promoted Twitter accounts, and thereby develop a profile of what an astroturfer might look like.

Methodology: using a script to interface with the Twitter API, we collected networks by picking a Promoted Account or someone listed as "Similar" to a promoted account. Then we created a graph out of everyone that account follows, and everyone that each of those friends follows. Given how unselective some people are in following others on Twitter, these graphs got big very quickly! Here's a sample, after trimming out accounts with fewer than 75 connections:

Mitt Romney's core Twitter network.
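For readers who want to picture the collection step, here is a rough sketch of a two-hop "who follows whom" crawl followed by a connection-count trim. It is not our actual script: the `following` dictionary is a hypothetical stand-in for Twitter API calls, the account names are made up, and the toy threshold is 3 rather than 75.

```python
def two_hop_network(seed, following):
    """Directed edges (follower, followed) within two hops of seed.

    `following` is a hypothetical lookup: account -> accounts it follows.
    """
    edges = set()
    for friend in following.get(seed, []):
        edges.add((seed, friend))
        for friend_of_friend in following.get(friend, []):
            edges.add((friend, friend_of_friend))
    return edges


def trim_by_connections(edges, min_connections):
    """Keep edges whose endpoints both have at least min_connections.

    Connections are counted once, on the untrimmed graph.
    """
    degree = {}
    for source, target in edges:
        degree[source] = degree.get(source, 0) + 1
        degree[target] = degree.get(target, 0) + 1
    keep = {node for node, count in degree.items() if count >= min_connections}
    return {(s, t) for s, t in edges if s in keep and t in keep}


following = {
    "seed": ["a", "b"],
    "a": ["b", "c"],
    "b": ["a"],
}
edges = two_hop_network("seed", following)
core = trim_by_connections(edges, min_connections=3)
print(len(edges), sorted(core))
```

Note that a single-pass trim like this can leave behind accounts whose remaining connections fall below the threshold; whether to repeat the trim until nothing changes is a design choice.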
After collecting 150 such graphs, we started looking at various network metrics. Here's a visual comparison of Promoted versus Similar accounts based on those metrics:

Radial graphs created by Echo.
("Normalized" means fitting all values into the interval [0,1]. This makes measures more comparable between large numbers; e.g. cliques were often measured in the millions).
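As a quick illustration of that rescaling, here is a minimal sketch assuming plain min-max normalization (one common way to fit values into [0,1]; the exact formula behind the radial graphs may differ). The clique counts are made-up numbers.

```python
def min_max_normalize(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        # All values identical: there is no spread to rescale.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


clique_counts = [1_000_000, 2_500_000, 4_000_000]  # made-up numbers
normalized = min_max_normalize(clique_counts)
print(normalized)
```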



At a glance, it looks like Promoted accounts have higher Closeness Centrality (roughly speaking, this reflects how few steps it takes to reach everyone else in the network from that account). Promoted accounts also tend to have fewer followers. This makes sense -- Bill Gates or Justin Bieber don't need to pay for promotion, because they already have millions of followers. It tends to be mid-size accounts which are promoted, and this is reflected in the numbers.
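For anyone unfamiliar with the metric, closeness centrality can be sketched in a few lines. This is the standard unweighted definition (number of reachable accounts divided by the total shortest-path distance to them), written with a plain breadth-first search; the three-node graph is made up, and real analyses would normally lean on a library such as NetworkX instead.

```python
from collections import deque


def closeness_centrality(graph, node):
    """Closeness of node in an unweighted, undirected graph.

    graph maps each node to the set of its neighbors. The score is
    (reachable nodes) / (sum of shortest-path distances to them),
    computed only over nodes reachable from `node`.
    """
    distances = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    total = sum(distances.values())
    if total == 0:
        return 0.0  # isolated node: nothing is reachable
    return (len(distances) - 1) / total


# Made-up path graph: a - b - c
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(closeness_centrality(graph, "b"), closeness_centrality(graph, "a"))
```

The middle account scores highest because it is one step from everyone, while the end accounts need two steps to reach each other.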

Finally, using multiple linear regression, we tried to see which attributes can predict whether an account is Promoted or not. Here is the output for several different specifications:


                  (1)           (2)           (3)           (4)
Following         2.55E-05      0.000449      4.04E-05      1.02E-03
                  (0.000093)    (0.000388)    (0.00018)     (5.65E-04)
Followers         -3.82E-08**   -1.57E-08     -2.81E-08**   -1.11E-08
                  (1.11E-08)    (1.81E-08)    (1.19E-08)    (1.47E-08)
Nodes after Trim  -1.55E-06     5.04E-06      8.21E-06      1.44E-05
                  (3.12E-06)    (8.38E-06)    (9.63E-06)    (1.82E-05)
Edges After Trim  -2.02E-07     -1.60E-06     -1.75E-06     -3.66E-06
                  (4.38E-07)    (1.35E-06)    (1.47E-06)    (3.04E-06)
Pendants          8.15E-07      -3.12E-07     -1.45E-07     -1.60E-06
                  (5.35E-07)    (1.45E-06)    (1.06E-06)    (2.31E-06)
Network Density   --            --            -0.06754      0.264582
                                              (0.197919)    (0.474809)
Closeness         --            --            -0.29216      8.57E-01
                                              (0.688034)    (8.93E-01)
Cliques           --            --            1.19E-08      3.48E-08
                                              (2.60E-08)    (3.80E-08)
PageRank          --            --            0.372417      -1.8429
                                              (0.591458)    (1.452847)
Constant          0.088058      0.267441      0.035358      -0.09244
                  (0.046366)    (0.258445)    (0.312639)    (0.580219)
Similar FE?       NO            YES           NO            YES
n                 136           136           120           120

(** marks significance at the 5% level or better. Standard errors, shown in parentheses, are robust to heteroskedasticity.) Nodes/edges after trim are the counts after removing all accounts with only a single connection ("pendants").


This output isn't very satisfying, because almost none of our measures proved to be statistically significant. This may reflect the relatively small sample size, or just low levels of variation in the measures of interest. 

If you're interested in reading the final paper or seeing the script used to collect our data, you can find it here (majority of credit for writing goes to Echo Keif; Kevin and I were mostly involved in the data collection and statistics side).

This project was interesting because as far as I know, there is still very little information out there about Promoted accounts. Wild stab in the dark this might be, but since it's an early stab in the dark I think it still represents a contribution. Until Twitter makes info about Promoted accounts available via the API, broader efforts to understand who is promoted and what they gain from it will remain a very rough science.

Saturday, May 12, 2012

Fun with Twitter Metrics

Using tweepy I've been looking at the characteristics of my Twitter following. I found these histograms pretty interesting.

The first shows how many people my followers are following and followed by. (The x-axis is the relevant number, the y-axis shows how many incidences of that number of friends/followers occur).

Followers (Blue) and Following (Green).

Next is the number of status updates posted. Looks like lots of my followers haven't tweeted much at all! It's a dilemma: do I unfollow them for being inactive? But, because the inactives aren't tweeting, they aren't flooding my timeline with stuff I don't want to read, either... Twitter-vanity might make me keep them, just so that whatever bot is running those accounts doesn't unfollow me.

Status Count.

Finally, number of favorite tweets by user. Lots of people don't seem to use the "Favorite" function of Twitter at all. I probably have fewer than 10 tweets I've marked as "Favorite" (it seems like such a commitment). It's nice to be able to tag a link or something worth going back to later, so I'm glad Twitter has this feature... even though it is, apparently, hardly used.

Favorite Tweets.

Then there are a few accounts at the far right of the distribution with lots of favorites. What's going on here?

Generally the distributions resemble a power law, which is not surprising when looking at social networks.
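One way to put a number on "resembles a power law" is to estimate the exponent directly. The sketch below uses the standard continuous maximum-likelihood estimator, which is an approximation here since follower counts are whole numbers; it is not something I ran for the histograms above, and both the data and the cutoff `x_min` are made up.

```python
import math


def power_law_exponent(values, x_min):
    """Maximum-likelihood estimate of alpha for p(x) ~ x^(-alpha).

    Only values >= x_min are used: alpha = 1 + n / sum(ln(x / x_min)).
    """
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value above x_min")
    return 1 + len(tail) / log_sum


follower_counts = [12, 15, 20, 35, 60, 150, 900, 12000]  # made-up data
alpha = power_law_exponent(follower_counts, x_min=10)
print(round(alpha, 2))
```

A heavier tail (more accounts with huge counts) pushes the estimate toward 1, while a distribution that drops off quickly gives a larger exponent.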

Twitter metrics will be an ongoing project, so this is just the beginning. If you find this stuff interesting, check back in a few days.

Tuesday, May 8, 2012

Farm Subsidies: A Picture Worth 1,000 Words

Who ever said there's no skiing in Iowa? Source.
In other news, the Institute of Medicine is advising the government to adopt a series of policies to control the American "obesity epidemic." One such policy is a proposed "soda tax."

I have a better idea. Instead of taxing soda, why don't we repeal the subsidies which make the primary ingredient, high-fructose corn syrup, so dirt cheap to produce?