Thursday, February 26, 2015

Medusa: A programming framework for crowd sensing applications

Analysis 

This paper discusses Medusa, a distributed, sandboxed runtime environment that provides abstractions and a programming language to specify, track and complete all the tasks in a crowd sensing scenario.
This paper is interesting because it clearly identifies and articulates the different stages and requirements of a workflow for a typical crowd sensing scenario, described below.
Use Case
- I'd like to use one of the examples in the paper to illustrate its contributions. Consider the number of steps in a crowd sourcing scenario.

S0 : A researcher wishes to obtain some video data for her analysis.
S1 : She would have to reach out to volunteers.
S2 : She would have to negotiate with them, and possibly offer them money, to provide her with videos taken at different places and times.
S3 : She or another set of participants would then have to sift through all the submissions (which may be a few frames or the entire video) and select the relevant videos.
S4 : She would have to request the full video from each selected volunteer and pay for the selected videos and services.

The second stage, which involves the volunteers, can also be broken down into smaller tasks:
       T1: Volunteer signs up for crowd sourcing
       T2: Starts taking videos
       T3: Uploads videos
       T4: Waits for the selection stage (S3 above) and payment

Medusa Design

The architecture consists of a partitioned runtime, a programming language and interpreter described in detail below. 

The MedScript programming language is an XML-based domain-specific language built around two high-level abstractions, stages and connectors.
  • Stage: Each stage is one of two types, a sensing-processing-communication (SPC) stage or a human-intelligence-task (HIT) stage. SPC stages include TakeVideo, ExtractSummary and UploadVideo, while HIT stages include Recruit and Curate.
  • Connector: Defines the control flow from one stage to another and handles failures. A special fork-join construct enables concurrent execution of stages.
  • Data model: Medusa provides support for namespaces and standard sensor data types.
  • User mediation: the user has to be involved during Recruit, uploadData, takeVideo and annotate.
  • Failure semantics: a task is aborted or retried on failure, as dictated by the requester.
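
To make the stage/connector idea concrete, here is a minimal Python sketch of how a linear chain of stages with retry-or-abort semantics might be executed. It is not MedScript (which is XML-based) and not the authors' implementation: the `Stage`, `Connector` and `execute` names, and the `max_retries` knob, are assumptions made for illustration, and the fork-join construct is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Stage:
    name: str
    kind: str                   # "SPC" or "HIT"
    run: Callable[[], bool]     # returns True on success
    max_retries: int = 0        # hypothetical: extra attempts the requester allows


@dataclass
class Connector:
    on_success: Optional[str] = None   # next stage if this one succeeds
    on_failure: Optional[str] = None   # next stage if it fails; None aborts the task


def execute(stages: Dict[str, Stage],
            connectors: Dict[str, Connector],
            start: str) -> List[str]:
    """Follow connectors from `start`; return the stages that succeeded, in order."""
    completed: List[str] = []
    current: Optional[str] = start
    while current is not None:
        stage = stages[current]
        # Retry on failure, up to the requester's limit (any() stops at the first success).
        ok = any(stage.run() for _ in range(stage.max_retries + 1))
        conn = connectors.get(current, Connector())
        if ok:
            completed.append(current)
            current = conn.on_success
        else:
            current = conn.on_failure
    return completed
```

A stage with no connector ends the task, and a failed stage whose connector has no `on_failure` target aborts it, which mirrors the abort-or-retry choice left to the requester.
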
The Cloud Runtime consists of
  • MedScript Interpreter, which performs compile-time checks on the submitted task descriptions and stores the intermediate results in the StageStateTable, a persistent store.
  • Task Tracker, of which there is one instance per submitted task. It spawns worker instances and coordinates stage execution for all instances of the task, maintaining state information in the StageStateTable.
  • Worker Manager, Stage Library
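
The division of labour between the Task Tracker and the StageStateTable can be sketched as follows. This is only an illustration under assumptions: the class and method names, the state strings, and the in-memory dictionary standing in for the persistent store are all invented here, not taken from the paper.

```python
from typing import Dict, List, Tuple


class StageStateTable:
    """In-memory stand-in for the persistent store (a real one would be a database)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str, str], str] = {}

    def put(self, task_id: str, worker_id: str, stage: str, state: str) -> None:
        self._rows[(task_id, worker_id, stage)] = state

    def get(self, task_id: str, worker_id: str, stage: str) -> str:
        return self._rows.get((task_id, worker_id, stage), "not_started")


class TaskTracker:
    """One tracker per submitted task; it coordinates every worker instance of that task."""

    def __init__(self, task_id: str, stages: List[str], table: StageStateTable) -> None:
        self.task_id = task_id
        self.stages = stages
        self.table = table

    def spawn_worker(self, worker_id: str) -> None:
        # A new instance starts with its first stage marked as running.
        self.table.put(self.task_id, worker_id, self.stages[0], "running")

    def report(self, worker_id: str, stage: str, ok: bool) -> None:
        # Record the outcome, then start the next stage if there is one.
        self.table.put(self.task_id, worker_id, stage, "done" if ok else "failed")
        nxt = self.stages.index(stage) + 1
        if ok and nxt < len(self.stages):
            self.table.put(self.task_id, worker_id, self.stages[nxt], "running")

    def current_stage(self, worker_id: str) -> str:
        # The first stage that is not yet done, or "finished" if all are done.
        for stage in self.stages:
            if self.table.get(self.task_id, worker_id, stage) != "done":
                return stage
        return "finished"
```

Keeping all state in the table rather than in the tracker object is what would let a tracker be restarted without losing track of its workers.
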
The Smartphone runtime consists of
  • the Stage Tracker, which supports multiple concurrent task instances
  • MedBox, a sandboxed runtime that disallows APIs that access the file system or perform dynamic API binding.
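
One simple way to picture the MedBox restriction is a denylist check over the API names a stage references. The sketch below is an assumption-heavy illustration, not the paper's mechanism: the function names are hypothetical, and the Java-style prefixes merely stand in for "file system access" and "dynamic binding".

```python
from typing import Iterable, List

# Hypothetical denylist: prefixes standing in for file-system access and
# dynamic binding. The actual list used by MedBox is not given in this review.
FORBIDDEN_PREFIXES = ("java.io.File", "java.nio.file.", "java.lang.reflect.")


def forbidden_calls(referenced_apis: Iterable[str]) -> List[str]:
    """Return the referenced API names that match a forbidden prefix."""
    return [api for api in referenced_apis if api.startswith(FORBIDDEN_PREFIXES)]


def admit_stage(referenced_apis: Iterable[str]) -> bool:
    """Admit a stage only if it references no forbidden API."""
    return not forbidden_calls(referenced_apis)
```

A check like this would run before a stage is loaded, so a stage that touches the file system is rejected rather than stopped halfway through.
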
TradeOffs / Limitations 

1. Is the downloading of stage libraries and the uploading of videos done over WiFi or 3G? How much data is sent in each stage? The paper does not say. Even the section on limited resource usage reports time, not the amount of data sent.

For example, every time a stage like uploadData is initiated on the smartphone, it downloads the stage libraries, whose size is unknown (presumably large, since they are purged from the phone regularly). No evaluation of this was shown. Recruits cannot know how much bandwidth they will need, which limits participation to users with fixed bandwidth plans.

They could make these stages user-mediated as well and perform data-heavy work only when WiFi is available. However, there is always a trade-off with usability when many tasks are user-initiated.
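
The WiFi-only suggestion amounts to a small policy decision before each transfer. Here is a rough sketch of what I have in mind; `Transfer`, `decide` and the idea of a per-user cellular budget are my own assumptions, not something the paper describes.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    name: str
    size_bytes: int


def decide(transfer: Transfer, on_wifi: bool, cellular_budget_bytes: int) -> str:
    """Return "send" or "defer" for a pending transfer."""
    if on_wifi:
        return "send"       # WiFi: no cap applied
    if transfer.size_bytes <= cellular_budget_bytes:
        return "send"       # small enough to fit the remaining cellular budget
    return "defer"          # too large for cellular: wait for WiFi
```

With such a rule, a small summary could still go out over 3G while the full video waits for WiFi, without asking the user to approve every transfer.
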

2. They have done a reasonable job of minimizing intervention on the phone by intercepting SMSes. A pulling/polling service would have been a resource hog.

However, they haven't discussed how this SMS stage tracker affects the usability of the phone's messaging application.
Does it parse every incoming message, or does it parse the locally stored messages at a predefined time?

3. Medusa has a good policy of re-executing stages on failures/timeouts, depending on the control flow specified by the requester. However, if the user deletes all his messages before they are processed, would that force the task tracker to resend messages continuously?

4. Similarly, how much of the smartphone's RAM does Medusa end up using? Again, they have reported the numbers only in terms of time. It may substantially slow down other applications.

5. The authors suggest that privacy could be a secondary concern if the volunteers were offered the right incentives. I do not agree with this statement. I'd like to have some guarantees on privacy even if I were receiving money for my services. They do provide a simple user opt-in, where the volunteer reads the requester's privacy policy, which is better than nothing.

6. I think it was a good idea to allow local processing of sensor data through worker/user mediation. This ensures better quality of submissions to the requester.

However, the paper "No One size fits all" spoke of different metrics like compliance, temporal relationships, popularity etc.

No such metrics were considered in this paper. They could include these metrics and tag popular volunteers or rank them based on a point system. This could be a bigger incentive than money.
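
A point system of this kind could be quite small. The sketch below is only my illustration: the fields, the weights, and the way compliance and selection rate are defined are assumptions, not metrics taken from either paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VolunteerStats:
    name: str
    tasks_accepted: int
    tasks_completed: int
    submissions_selected: int


def score(v: VolunteerStats, w_compliance: float = 0.6, w_selection: float = 0.4) -> float:
    """Weighted score in [0, 1]; the weights are arbitrary placeholders."""
    # Compliance (assumed definition): fraction of accepted tasks actually completed.
    compliance = v.tasks_completed / v.tasks_accepted if v.tasks_accepted else 0.0
    # Selection rate (assumed definition): fraction of completed tasks the requester selected.
    selection = v.submissions_selected / v.tasks_completed if v.tasks_completed else 0.0
    return w_compliance * compliance + w_selection * selection


def rank(volunteers: List[VolunteerStats]) -> List[VolunteerStats]:
    """Highest score first."""
    return sorted(volunteers, key=score, reverse=True)
```

The ranked list could then be shown to requesters during recruitment, or to volunteers as a leaderboard.
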
