From "Crowdsourced Enumeration Queries" by Beth Trushkowsky, Tim Kraska, Michael J. Franklin, Purnamrita Sarkar. ICDE 2013 best paper award, and one of my recent favorites.
Hybrid human/computer database systems promise to greatly expand the usefulness of query processing by incorporating the crowd for data gathering and other tasks. Such systems raise many implementation questions. Perhaps the most fundamental question is that the closed world assumption underlying relational query semantics does not hold in such systems. As a consequence, the meaning of even simple queries can be called into question. Furthermore, query progress monitoring becomes difficult due to non-uniformities in the arrival of crowdsourced data and peculiarities of how people work in crowdsourcing systems. To address these issues, we develop statistical tools that enable users and systems developers to reason about query completeness. These tools can also help drive query execution and crowdsourcing strategies. We evaluate our techniques using experiments on a popular crowdsourcing platform.
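The statistical machinery here is essentially species estimation: from how often each distinct answer has been seen so far, estimate how many distinct answers remain unseen. As a rough illustration (not the paper's exact estimator, which is more refined), here is a minimal sketch of one classic estimator in that family, Chao1, which extrapolates from singleton and doubleton counts; the function and variable names are mine.

```python
from collections import Counter

def chao1_estimate(answers):
    """Estimate the total number of distinct items in the underlying set,
    given a multiset of crowd answers (one string per worker response).

    Chao1: N_hat = D + f1^2 / (2 * f2), where D is the number of distinct
    answers observed, f1 the number seen exactly once, f2 exactly twice.
    """
    counts = Counter(answers)
    d = len(counts)                                  # distinct answers seen
    f1 = sum(1 for c in counts.values() if c == 1)   # singletons
    f2 = sum(1 for c in counts.values() if c == 2)   # doubletons
    if f2 == 0:
        # bias-corrected form avoids division by zero when no doubletons yet
        return d + f1 * (f1 - 1) / 2.0
    return d + (f1 * f1) / (2.0 * f2)

# Toy run: six responses naming US states, two of them repeats.
responses = ["Texas", "Ohio", "Texas", "Iowa", "Utah", "Ohio"]
print(chao1_estimate(responses))   # 5.0: estimates more states remain unseen
```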
I've been playing with crowdsourcing function evaluation, and this line of work shines: different types of human queries suggest different types of semantics. For example, "select all states in the US" makes sense as a query over a fixed, closed set, while "select all ice cream flavors" has, arguably, a quantification error: the answer set is open-ended. The differences lead to fun stuff, such as distinct query plan optimizations for different kinds of human computation. This style of thinking has guided my own recent implementation work.
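One concrete way the distinction can show up in execution: a completeness estimate like the one sketched above can drive a stopping rule for open-ended enumerations, while a closed set such as US states can simply be run until the known cardinality is reached. A rough sketch, with the batching, threshold, and the hypothetical ask_crowd_batch helper entirely made up for illustration:

```python
def run_enumeration(ask_crowd_batch, known_cardinality=None,
                    coverage_target=0.95, max_batches=50):
    """Keep requesting crowd answers until the result looks complete.

    ask_crowd_batch() is a stand-in for whatever issues a batch of crowd
    tasks and returns a list of answer strings; it is hypothetical here.
    """
    answers = []
    for _ in range(max_batches):
        answers.extend(ask_crowd_batch())
        distinct = len(set(answers))
        if known_cardinality is not None:
            # Closed world: stop once every known member has been named.
            if distinct >= known_cardinality:
                break
        else:
            # Open world: stop once estimated coverage is high enough.
            estimated_total = chao1_estimate(answers)
            if distinct / estimated_total >= coverage_target:
                break
    return sorted(set(answers))
```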
The overall research field intersects many good topics: linguistics / NLP, query planning, language design, etc.
Pdf is here.