Crowd Science

Watch a mob in any news or sports footage and they’ll seem anything but smart. Today technology goes deeper, plucking the individual gems of brilliance that live in all of us and creating combined efforts on a scale we’ve never seen before…

In 1907, English explorer, geographer, inventor, meteorologist, geneticist and statistician Francis Galton published a paper about a county fair where a crowd of onlookers was trying to guess the weight of an ox. Although nobody, including professional ranchers, guessed the correct weight, Galton was intrigued to discover that the average of the combined guesses was almost exactly right. New Yorker journalist James Surowiecki tells the story in his 2004 book The Wisdom of Crowds to introduce his point: we all know (or can do) a little bit, and all we need is a system to put all those little bits together to get a much more detailed picture.
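The averaging effect is easy to try for yourself. What follows is only a rough sketch, not Galton's data: the ox weight, the number of guessers and the spread of their guesses are all made-up values, and it assumes individual errors are random and unbiased, which real crowds don't always manage.

```python
import random
import statistics


def simulate_guesses(true_weight, n_guessers, spread, seed=0):
    """Return hypothetical guesses scattered randomly around the true weight."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.gauss(true_weight, spread) for _ in range(n_guessers)]


# All three numbers below are illustrative assumptions, not historical figures.
true_weight = 1200.0
guesses = simulate_guesses(true_weight, n_guessers=800, spread=100.0)

# The crowd's estimate is simply the average of every guess.
crowd_estimate = statistics.mean(guesses)
crowd_error = abs(crowd_estimate - true_weight)

# How far off a typical single guesser was, for comparison.
typical_individual_error = statistics.median(
    abs(g - true_weight) for g in guesses
)

print(f"crowd error: {crowd_error:.1f}")
print(f"typical individual error: {typical_individual_error:.1f}")
```

Under these assumptions the crowd's error comes out far smaller than a typical individual's, because guesses that are too high and too low largely cancel each other out.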

The term crowdsourcing wasn’t coined until the web age, undoubtedly because it enabled the collecting, aggregating and processing power we need to let crowdsourcing take full flight. Today it’s used for everything from judging the most attractive female posterior to gathering the funds to make movies.

Crowdsourcing has also found a unique place in science, as labs everywhere are realising they can use the collective eyes, ears, opinions and even gadgets owned by science enthusiasts around the world to contribute to research. Some examples, like Galaxy Zoo, are completely immersive (more below), while others are set-and-forget, like Stanford University’s Quake Catcher Network. Volunteers are sent an accelerometer that connects to a USB socket on their PC and picks up local tremors, which the software reports back. From there it’s a short leap to a smartphone app (most modern devices have accelerometers built in), the next stage currently in planning.

At least one example of digital crowdsourcing predates the term. Since May 1999 the University of California, Berkeley has overseen SETI@Home, the screensaver application that crunches data from radio telescopes looking for errant signals that might lead us to extraterrestrial life.

David Anderson is a research scientist with Berkeley’s space sciences laboratory, an adjunct professor in the University of Houston’s computer science department and the director of SETI@Home. In 1994 he was a specialist in distributed computing when scientist David Gedye contacted him to talk about a distributed computing project. SETI@Home was created at the perfect time: the home PC explosion was in full swing and, rather than spend millions on expensive supercomputers that would be obsolete in no time, SETI@Home sent data to the four winds, with dormant processing power on two million systems (so far) doing the work for free.

“David was inspired in part by the 25th anniversary of the Apollo project, which got the American public excited and involved,” Anderson says. “He was trying to think of something that would have a similar effect. People like it for three reasons. The first is support for the scientific goal of finding ET. Second is the competition of having the fastest PC on the block. Third is the community, interacting with other users via message boards, etc.”

The average duration for a SETI@Home user is four months, but a quick look at the Top Participants page on the website reveals that many have been running it since day one. The user base ‘pretty much mirrors PC ownership,’ as Anderson says. “It’s 50 percent American, the remainder largely European. Asia’s underrepresented.”

The other meme that probably drove SETI@Home uptake was the 1997 film Contact with Jodie Foster and Matthew McConaughey, which was a huge success and put SETI on the cultural consciousness map. Suddenly here was a chance to join the ground-breaking, renegade and profound search depicted in the movie right on your own computer. “The writing of Carl Sagan really conveyed the idea that radio SETI is a scientific project, not a fantasy,” Anderson adds.

And like the contact depicted in the film, it’ll be a momentous event if/when it happens. The closest we’ve ever come is the famous ‘Wow’ signal noticed by radio astronomer Jerry Ehman on August 15, 1977, a 72-second narrowband radio signal that appeared to come from outside our solar system but was never detected again. The bad news for SETI@Home users is that even if their version of the software finds a hello message from a far off civilisation, there’s no ‘congratulations, you’ve found ET!’ dialogue box.

“We get false alarms or exciting things to check very seldom,” Anderson admits, “and there’s a second stage of processing we do on our own computers. It finds a lot of potential signals but they’ve all been man-made interference so far. A ‘Eureka’ moment would only happen after we observe the same frequency and the same point in the sky again and see that it’s still there, but that’s pretty infrequent.”

Of course, crowdsourcing isn’t just about lending your computer to large-scale data investigation. One of the founders of Galaxy Zoo even said he preferred the term ‘citizen science’. “Crowd sourcing sounds a bit like [you’re just] a member of the crowd and you’re not, you’re our collaborator. Galaxy Zoo volunteers [aren’t] just passively running something on their computer and hoping that they’ll be the first person to find aliens. They have a stake in science that comes out of it, which means that they are now interested in what we do with it, and what we find.”

The original Galaxy Zoo project comprised images of a million objects taken from the Sloan Digital Sky Survey (with backers including NASA, the US Department of Energy and a host of science foundations around the world), with later iterations using Hubble photography. Volunteers would download and inspect images of galaxies, identifying them by type to build up a catalogue. “One grad student managed to classify 50,000 galaxies in a week,” says Chris Lintott, the University of Oxford astronomer who helps run the project. “If he’d kept that pace up for 76 years he’d have matched our classification total right now.”

Lintott points out that even if the tireless student had done so, the results still wouldn’t be as accurate as those of the crowd. As Francis Galton observed, our collective efforts are smarter than even the most skilled individual. “One of the key features of Galaxy Zoo is that we obtain multiple classifications for each image, which gives us extra information,” Lintott says. “It also means that collectively mistakes aren’t made, so the results are actually more useful than if we’d done them ourselves.”
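One simple way to picture how multiple classifications of the same image can be combined is a majority vote. The snippet below is a minimal sketch of that idea only; it is not Galaxy Zoo's actual pipeline, and the function name, labels and votes are hypothetical.

```python
from collections import Counter


def consensus(labels):
    """Return the most common label and the share of votes it received."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)


# Hypothetical classifications of one image by five volunteers.
votes = ["spiral", "spiral", "elliptical", "spiral", "spiral"]
label, agreement = consensus(votes)
print(label, agreement)  # spiral 0.8
```

The share of agreement is the "extra information" in this toy version: an image where volunteers split evenly can be flagged for a closer look, while a single stray click is simply outvoted.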

Inspired by a project called Stardust@Home (“I remember thinking ‘If people will look at dust grains, then surely they’d look at galaxies!’” Lintott remembers), Galaxy Zoo has similar patterns of use to SETI@Home. Lintott says most people ‘come and play for one or two visits’ while others become very dedicated and classify literally hundreds of thousands of images. But he still welcomes the contribution of people who only have a few minutes to spare. “The task of classification doesn’t need any astronomical knowledge,” Lintott says, “you don’t even need to know what a galaxy is. It relies on our ability for pattern recognition.”

And while SETI@Home has yet to produce a signal from an alien civilisation, volunteers contributing to Galaxy Zoo have already turned up new sights science has never seen before. One is Hanny’s Voorwerp object, a curious splodge just below a spiral galaxy. Found by Galaxy Zoo volunteer and Dutch schoolteacher Hanny van Arkel, the object still has researchers stumped.

To Lintott, that’s all part of the charm. “There’s a real desire [among Galaxy Zoo users] to do something that makes a contribution to science. It’s amazing to think that by sitting in front of a web browser and clicking a few buttons you’re actually adding to our store of information about the universe. As people spend more time in front of their computers, I think they want to do more than just reach the next level on Angry Birds, and our projects like Galaxy Zoo provide that. There’s also the chance to make a remarkable discovery, to see something that literally nobody has ever seen before.”

Of course, you can do a lot more to contribute to science than sit on your computer, whether it’s running a screensaver or classifying shapes. David Brodrick is a 32-year-old computer programmer at the CSIRO’s Australia Telescope in Narrabri and joining the NSW chapter of the Bureau of Meteorology’s Storm Spotters was a natural progression. “Growing up in a country town makes you very aware of the weather,” Brodrick says. “When I lived in the city [Sydney and Brisbane] the only tangible impact of the weather was deciding about whether to take a jumper or umbrella out but in the country, entire communities thrive or flounder depending on the seasons.

“My serious interest in studying the weather happened after a particularly severe storm hit Narrabri on the 20th of January 2005. We had a microburst, which I’d never heard of at the time. It’s a powerful downdraft of cold air that spreads out at the ground. This one caused winds in excess of 150km/h and left widespread damage in Narrabri. There was a tornado south of us near Coonabarabran that also left a swath of damage.”

Brodrick and some friends attended a course four years ago given by Michael Logan, who manages the NSW Storm Spotters network at the Bureau of Meteorology. “We were interested in learning more about severe weather so attending the course gave us the opportunity to meet the pros. Also, I work at a large telescope array where we take the BOM’s weather warnings very seriously because when storms are around we need to stow the big dishes. I have a professional interest in making sure the storm warnings for our area are as accurate as possible.

“Plus, my friends and I would often go storm chasing within about a hundred kilometres of town. The BOM do a good job with radar and weather models, but you get an immediate impression of how a storm cell’s evolving when you can see it right in front of you.”

Brodrick has been instrumental in monitoring severe weather in his part of the world, a role that comes with its own thrills. “Severe thunderstorms can be enough to get the adrenal glands pumping,” he laughs, “but being able to phone it through adds an extra degree of excitement.”

It’s a relationship that works both ways. There’s a weather station in Narrabri but the weather is not only notoriously changeable, it’s variable over a comparatively small area. In a country almost 4,000km wide like ours the skies can hide a lot of nasties from human eyes until it’s too late, and spotters can become a critical resource.

“Most of the severe weather people at the Bureau take feedback from spotters very seriously and sometimes they’ve issued or updated storm warnings within a couple of minutes of me phoning them, even at 3am,” Brodrick says. “When you see comments in the warnings about ‘120km/h winds near Narrabri’ or ‘golf ball sized hail near Grafton’ that information usually comes from spotters.”

Brodrick’s passion for storm spotting and reporting has also created a resource that complements and adds to the resources of the Bureau. After the 2005 storm that fascinated him and his friends so much, they built a network of weather stations in the local area and started to share the reports. After they reached out to other weather spotters across the country, the concept expanded and resulted in a website that collects real-time data for 15,000 locations. “The aim was to sustain the same concept that made Narrabri Weather useful to the community, and today Oz Forecast monitors data from several hundred weather stations as well as the full BOM network,” he says.

Technology has made inevitable improvements to the storm spotter’s craft, Brodrick calling the phone-based web browser the ‘epitome’ of how tech can help us as he and his friends and coworkers can check radar data from anywhere. But the visceral thrill in this sort of crowdsourcing is still being there rather than watching a screen. During the 2005 howler he recalls how his wife sent him into the backyard to retrieve a plastic washing basket. “Once I got outside I saw a big steel-framed greenhouse from the back paddock sailing through the air. I decided to leave the washing basket to meet its own fate and turned around to notice our garden shed was now in the neighbour’s yard! My wife forgave me for aborting my designated objective.”

Crowdsourcing 1.0

Just because the Internet has made it so much easier to wrangle the wisdom of the general public doesn’t mean it’s never happened before.

One thing you always hear about technology is that it enables and in some cases exacerbates human behaviour, it doesn’t create it. And we’ve been a species dedicated to the sharing of information from the human cloud for a long time. Whatever your views on gender relations today, it’s biological fact women were designed to collect berries and roots near the camp and men were designed to pursue, kill and drag home meat. We needed protein as well as vitamins, and the crowd-sourcing of prehistoric tribes spread the load.

Closer to our own time, we’ve all seen the links in online news stories asking us to upload our pics or write our own accounts if we witnessed the incident being reported. Before the web, news photographers would show up on the scene and their job wasn’t just documenting the aftermath but asking around in case the witnesses had any of their own pictures of the action. And what else is the letters to the editor page but a virtual town square where the wisdom of the crowd (points of view) can be collected and disseminated?

Since the early 1990s the Bureau of Meteorology has tapped into an army of volunteer weather enthusiasts to report on severe storm activity around the country. Overhead satellite and ground-based radar can only do so much in a country our size, and the Bureau’s Severe Thunderstorm Warning Service calls Storm Spotters an ‘important component’ of what they do.

Volunteers watch for specific conditions such as hail 2cm or more in diameter, winds that can snap large branches, tornadoes or local flash flooding. They can either ring a freecall number to report the conditions or fill in a card to mail to the Bureau later, giving a more complete picture of what happened where and adding to the expansive forecasting effort.

Small crowds

We don’t often think of our own family or household as a crowd, but a visionary project by MIT Media Lab Cognitive Machines Group director Deb Roy is delving into data recorded around his house to show us the wisdom of a very small crowd.

Roy set up fisheye cameras and audio recorders throughout his house and recorded enough footage for speech-to-text technology to extract over seven million words over the course of a year. When each instance was linked to the relevant point in the video footage, Roy was able to plot a four-dimensional representation of his baby son learning to say the word ‘water’: where in the house the word was said, the frequency of use, and the time he took to learn to say it properly.

The Birth of a Word project generated a map of 503 words Roy’s son learned in chronological order from the transcripts and in the case of the word ‘water’, he produced spatial data of where family members were in relation to mentions of the word. The result was a word landscape (see pic).

Roy’s team then applied their technique to larger datasets, moving on to TV signal content and the text of publicly available social media feeds. After pulling in about three billion comments a month, they cross-referenced what was on TV with what people were saying about it using semantic analysis. Suddenly the wordscape doesn’t just apply to Roy’s house, but the global online conversation about TV content, like a tag cloud you can see on any website (see pic). Graphs of social media and TV content are created and the individual links between and among them can be plumbed for very esoteric knowledge of what people are interested in, all in real time.

Profit sourcing

You need something done — maybe it’s the HTML of your new website, maybe it’s your tax return, maybe you just need someone to transcribe a recording you’ve made because you don’t type so well. Someone somewhere in the world is itching for the work, perfectly qualified and ready to start now — often for a price you won’t believe. The few dollars you wouldn’t think twice about paying for it might be the equivalent of a week’s salary somewhere in the world. In the old days you’d have to give your project to an expensive local provider and that trained-up worker thousands of miles away would stay hungry.

Welcome to the world of global outsourcing, where businesses like, and Amazon’s Mechanical Turk connect workers with employers on a scale international commerce has never seen before. Often bringing ultra-cheap but educated third world labour to western business, it’s the perfect way to inexpensively kick off the sort of business idea that would have required expensive preparation or R&D once upon a time.

“Absolutely, connects western businesses with freelancers primarily from developing world countries,” says Matt Barrie, CEO. “It’s a godsend to SMEs. You can literally have the spark of an idea and get it implemented at 4am for a few hundred dollars or so. And frankly, with 5 percent unemployment in Australia, we provide the much needed workforce that will help power this country into the digital economy.”