The six scenarios and datasets referenced below each provide a substantial dataset that represents a level 2 fusion problem. The simplest scenarios are from 2006 and 2007. Beginning in 2008, the range and complexity of the scenarios increased, and each challenge was divided into mini-challenges and a grand challenge. Data sets include unstructured text, structured data, images, and video data.
Quick links to each: VAST 2006, VAST 2007, VAST 2008, VAST 2009, VAST 2010, VAST 2011
The 2012 basic challenge is at this link:
http://visweek.org/visweek/2012/info/call-participation/vast-challenge
More detail on each is provided below.
The scenarios and datasets below are from the Visual Analytics Science and Technology (VAST) Challenge, which is a participation category of the IEEE VAST Symposia. The challenge aims to push the forefront of visual analytics tools using benchmark data sets and to establish a forum for advancing visual analytics evaluation methods. The objective of the forum is to speed the transfer of visual analytics technologies from research labs to commercial products and to increase the availability of evaluation techniques.
1. VAST 2006 (Grinstein, G., O’Connell, T., Plaisant, C., Scholtz, J., Whiting, M., IEEE VAST 2006 Contest, The tale of Alderwood, www.cs.umd.edu/hcil/VASTcontest06 (2006))
SCENARIO: (Go Here for Scenario Details) In January 2003, the FBI is investigating possible political misbehavior in the fictitious mid-sized vacation town of Alderwood, located on the banks of the Alderwood River in south-central Washington State. Alderwood is suffering from a loss of tourism due to the early 2000s economic crash. In addition, agriculture is adversely affected by the discovery of bovine spongiform encephalopathy (BSE, also known as “mad cow disease”), which has resulted in a beef export embargo. Yet there is a sudden influx of talented young men and women relocating to Alderwood, along with claims of some connection to the local government. The sources and reasons are not immediately obvious.
OBJECTIVE: Identify and describe what is happening in Alderwood.
DATASET OVERVIEW (All datasets are synthetic) – Go Here For VAST 2006 Dataset. The dataset has the following types of data. File types are shown in parentheses.
In addition, a second set of files is available in which the 1182 news story files have been preprocessed through an entity extraction routine, using the MITRE ALEMBIC tool. (Unfortunately, the link describing ALEMBIC is broken.)
2. VAST 2007
SCENARIO: (Go Here for Scenario Details) In the fall of 2004, you are the analyst for an unnamed agency investigating some unexpected activities concerning wildlife law enforcement, endangered species issues, and ecoterrorism.
OBJECTIVE: Determine what is occurring, based on the data.
DATASET OVERVIEW (All datasets are synthetic) – Go Here For VAST 2007 Dataset. The dataset has the following types of data. File types are shown in parentheses.
3. VAST 2008
Beginning in 2008, the VAST challenges changed from a single challenge to a set of mini-challenges, which were then combined into a grand challenge.
SCENARIO AND OBJECTIVES (Go Here for Scenario Details): The fictional Caribbean island nation of Isla Del Sueño is experiencing a new religious movement, the Paraiso movement, which is causing controversy and political unrest. You’ve been asked to investigate certain aspects of this movement.
Overall – integrate all data to determine the social network of the Paraiso movement at the end of the time period, which names can be associated with individual activities, the geographical range of the Paraiso movement and how it changes over time, and how the major beliefs of the Paraiso movement affect its activities.
DATASET OVERVIEW (All datasets are synthetic) – Go Here For VAST 2008 Dataset. The dataset has the following types of data. File types are shown in parentheses.
4. VAST 2009
SCENARIO AND OBJECTIVES (Go Here for Scenario Details): An employee of the embassy in the country of Flovania is suspected of sending data to an outside criminal organization. Determine who the employee is, using movement data gathered from badge tracking, system network logs, social networking site data, and video surveillance camera data.
OBJECTIVE: Determine the scenario. Who are the major players in the scenario and what are their relationships?
DATASET OVERVIEW (All datasets are synthetic) Go Here For VAST 2009 Dataset:
5. VAST 2010
SCENARIO: The scenario is divided into three mini-challenges, with a “grand challenge” to combine the results of the mini-challenges. Mini-Challenge 1 is about illegal arms deals that involve several countries. Mini-Challenge 2 is about a pandemic outbreak of a virus across several cities in the world. Mini-Challenge 3 is to investigate the source of a virus strain taken from a victim of the pandemic. For the Grand Challenge, determine any possible linkage between the illegal arms dealing and the pandemic outbreak.
OBJECTIVE:
1. Briefly describe your hypothesized linkage between the arms dealing activity and the pandemic outbreak.
2. Given the hospital and death records, characterize the spread of the disease, and determine any anomalies across countries.
3. Given some genetic data on a strain of the virus, determine the country of origin of the virus, and its mutations and resistances to treatment.
4. Some countries with arms dealers identified in Mini-Challenge 1 did not suffer pandemic outbreaks in Mini-Challenge 2. Provide a hypothesis as to why some countries that may have been involved with arms dealers did not suffer an outbreak.
DATASET OVERVIEW (All datasets are synthetic) Go Here For VAST 2010 Dataset:
6. VAST 2011
SCENARIO: This scenario takes place in Vastopolis, a major metropolitan area with a population of approximately two million residents. The scenario is divided into three mini-challenges, with a grand challenge to integrate the results. Mini-Challenge 1 is to characterize the spread of an epidemic in Vastopolis. Mini-Challenge 2 looks at security issues in the computer networking operations at a freight company operating in Vastopolis (the All Freight Corporation). Mini-Challenge 3 is an investigation into terrorist activity in the Vastopolis metropolitan area. In the Grand Challenge, you are charged with investigating the cause of the epidemic and determining any link to possible terrorist activity.
OBJECTIVE:
DATASET OVERVIEW (All datasets are synthetic) Go Here For VAST 2011 Dataset
For Mini-Challenge 1, three structured databases (XLS) and a map (JPEG).
For Mini-Challenge 2, five sets of computer system logs (structured text) across three days.
For Mini-Challenge 3, 4474 unstructured text files.
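As a rough illustration of how the Mini-Challenge 3 text corpus might be loaded for analysis, the sketch below reads every text file under a directory into memory and compares the count with the 4474 files mentioned above. The directory name `mc3_data`, the `.txt` extension, and the function name `load_text_files` are assumptions made for illustration only; the actual layout of the VAST 2011 download may differ.

```python
from pathlib import Path

EXPECTED_FILE_COUNT = 4474  # number of text files stated in the dataset overview


def load_text_files(root, pattern="*.txt"):
    """Read every file matching `pattern` under `root` into a dict.

    Keys are paths relative to `root`; values are the file contents.
    The "*.txt" default is an assumption about how the files are named.
    """
    root = Path(root)
    documents = {}
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            # errors="replace" keeps loading even if a file is not valid UTF-8
            documents[path.relative_to(root).as_posix()] = path.read_text(
                encoding="utf-8", errors="replace"
            )
    return documents


if __name__ == "__main__":
    data_dir = Path("mc3_data")  # hypothetical location of the unpacked files
    if data_dir.is_dir():
        docs = load_text_files(data_dir)
        print(f"Loaded {len(docs)} files (expected {EXPECTED_FILE_COUNT})")
    else:
        print(f"Directory {data_dir} not found; adjust the path first")
```

Checking the file count right after loading is a cheap way to confirm that the download unpacked completely before any analysis begins.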
CG&A Journal paper about the VAST 2007 contest
CG&A Special Issue on Visual Analytics Evaluation (published in April 2009)