AI3 Prize Competition FAQ

Check below for frequently asked questions and the corresponding answers on the AI for IoT (AI3) Prize Competition. All answers and information are subject to change.

If you have other questions, please email [email protected] or consult the official rules document available here.


Where do I register as a walk-on contestant for Phase 2?

Please register and submit your concept paper here. Be sure to review the official rules document prior to submitting.

How many members are allowed in a team?

There is no limit to the number of members on a team, but each team must have an official representative and an alternate representative. Refer to the Official Rules for specific details. Rules are subject to change.

Can international students currently studying at a university in the U.S. on an F1 visa take part in the competition?

Students studying on an F1 visa can participate if they are not citizens of a sanctioned country.
The official representative and the alternate representative must be U.S. citizens or permanent residents. This is not a requirement for other team members.

Can someone under the age of 18 participate?

A person under 18 can participate in the challenge but cannot act as the official representative or the alternate representative. The official representative will receive any cash prizes on behalf of the team and is responsible for distributing prize money amongst the team members. Refer to the Official Rules for specific details. Rules are subject to change.

Are all team members aged 18 and above eligible for cash prizes?

The official representative will receive any cash prizes on behalf of the team and is responsible for distributing prize money amongst the team members.

What if a teammate graduates? Since this is a long competition, can they still participate while working full time?

This challenge is not restricted to students. Even if they graduate, a teammate can still participate and be eligible for prizes.

The competition uses data in different formats. Can you please elaborate more on that?

To emphasize the many different ways that IoT data can be consumed, the competition will use a variety of data formats. For the testing stage, all data will use well-established, widely used formats. For unknown data sources, we want to test how solutions use the sources and integrate different data formats. Possible formats include JSON files and spreadsheet data. The data format itself matters less than how you work with it, and it should not be a source of concern.
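As an illustration of integrating different formats, here is a minimal sketch that reads the same kind of reading from a JSON file and from spreadsheet-style CSV into one common record shape. The field names `sensor_id` and `value` are assumptions made for this example, not the format the competition will use.

```python
import csv
import io
import json


def from_json(text):
    """Parse a JSON array of readings into a list of dicts."""
    return [
        {"sensor_id": str(r["sensor_id"]), "value": float(r["value"])}
        for r in json.loads(text)
    ]


def from_csv(text):
    """Parse CSV rows (with a header line) into the same dict shape."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"sensor_id": row["sensor_id"], "value": float(row["value"])}
        for row in reader
    ]


# Hypothetical sample inputs; real competition data may look different.
json_text = '[{"sensor_id": "t1", "value": 21.5}]'
csv_text = "sensor_id,value\nw7,3.2\n"

records = from_json(json_text) + from_csv(csv_text)
```

Once every source is mapped to the same record shape, the rest of a solution does not need to know which format a reading arrived in.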

Is the expectation that sample data will reside on the database being built?

A virtual sensor platform will be created to host the sample data. Additional details will be provided during the competition.

The earthquake scenario included in the rules document states that “while all three cities have personal area monitoring for their responders, their systems may differ.” What are these personal area monitors?

Those are portable and/or fixed sensors working together to provide information for the area of interest.

Should our team incorporate the scenario of an earthquake in American City X in our concept paper?

The earthquake scenario provides a common playground for all the contestants. Your solution should focus on the scenario, which, as described in the document, has cascading effects, such as electricity outages, main water pipe breakages, flooding, telecommunication problems, and active shooting.

What are we expected to show up to Disaster City with?

You should have everything ready before you show up at Disaster City, where you will receive streaming data from physical sensors. You can bring your own laptop or deploy your solution on a server or a cloud computing platform. Whatever your solution is, you are expected to INGEST the streaming data from the physical sensors, EVALUATE the data, CATEGORIZE the data, and finally present the results that you believe will improve situational awareness for first responders. All of this should be done in real time; otherwise, the information will likely lose its value in emergency scenarios.

What do you mean specifically by ingestion, evaluation, and categorization?

Ingestion: process of acquiring or absorbing data
Evaluation: preprocessing, exploration, and pre-assessment of data
Categorization: process of placing data into classes to help identify data
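
The three steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the message layout (a JSON object with a `value` field), the `RANGES` table, and its thresholds are all hypothetical, and a real solution would likely replace the range rule with a trained model.

```python
import json
from statistics import mean

# Hypothetical value ranges per category; not competition rules.
RANGES = {
    "temperature_c": (-40.0, 45.0),
    "heart_rate_bpm": (46.0, 220.0),
}


def ingest(raw_messages):
    """Ingestion: acquire raw messages and decode them into dicts."""
    return [json.loads(m) for m in raw_messages]


def evaluate(readings):
    """Evaluation: drop missing values and summarize what remains."""
    values = [r["value"] for r in readings if r.get("value") is not None]
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


def categorize(summary):
    """Categorization: pick the first class whose range covers the data."""
    if summary["count"] == 0:
        return "unknown"
    for label, (low, high) in RANGES.items():
        if low <= summary["min"] and summary["max"] <= high:
            return label
    return "unknown"


raw = ['{"value": 21.0}', '{"value": null}', '{"value": 23.0}']
summary = evaluate(ingest(raw))
category = categorize(summary)
```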

In the Phase 3 evaluation criteria, there is a section “Percentage of correctly classified unknown streaming data sources.” What would be considered ‘correctly classified’?

As stated in the official rules document, “the winning solution should be capable of recognizing existing sensor data elements and incorporating previously unknown data elements based on contextual analysis”. In other words, your team will need to propose and develop a solution that recognizes unknown data based on contextual analysis and incorporates that data with the known data sets to provide useful insight for first responders. Being able to identify and categorize the unknown streaming data sets is an essential step before you can use them. In Phase 3, the contestants will be given several unknown streaming data sets and asked to automatically categorize them based on models trained on known data sets. For example, if your model labels temperature data as temperature data rather than as blood pressure data, it would be considered “correctly classified.”
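
One plausible reading of the "percentage of correctly classified" criterion is a plain accuracy figure: the share of unknown sources whose predicted category matches the true one. The sketch below assumes that reading; the official rules define the actual scoring, and the function name and sample labels are illustrative only.

```python
def percent_correct(predicted, actual):
    """Return the percentage of sources whose predicted label is right."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one source is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Hypothetical labels for four unknown streaming sources.
actual = ["temperature", "wind speed", "temperature", "blood pressure"]
predicted = ["temperature", "wind speed", "blood pressure", "blood pressure"]

score = percent_correct(predicted, actual)  # 3 of 4 correct
```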

Will the only expected ML component come from the ingestion, evaluation, and categorization of data from multiple IoT devices?

The ML component(s) could be at various stages of the pipeline developed to handle multiple IoT devices. We encourage the contestants to be creative in the ways of using ML models.

If the input to the ML model is the sensor data, and its metadata, provided by the competition organizers, what are the expected outputs or labels in relation to ML training and classifying?

The labels will be the categories of the sensor data.

Under the guidelines provided, would a multimodal machine learning algorithm that fuses complementary information from multiple sensors be an appropriate ML algorithm to use in this competition?

You are free to use any ML algorithms. We leave the decision to our contestants as part of the challenge. “All models are wrong, but some are useful.”

Will the contestants be responsible for gathering or generating the data to train the ML model(s)?

The organizers will provide data for training and validating ML models in Phase 2 and Phase 3. For the Phase 1 concept paper, no data sets are needed.

Can we use external data sets?

The contestants are allowed to use external data sets for the development and testing of their models, but all the models will be evaluated with the datasets provided by the organizers.

What do unknown data, unknown data source, and streaming data mean? How are these components ‘unknown’?

The unknown data are the unlabeled/unidentified data. The unknown data will be provided as “streaming data” through an MQTT server, Apache Kafka, and/or other widely used streaming services.
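
Because the data arrives as a stream rather than as a file, a solution usually has to buffer recent readings per source before it can categorize them. The sketch below shows that buffering on a simulated list of messages. It does not connect to a real MQTT or Kafka broker, and the topic names, payload layout, and window size are assumptions made for the example.

```python
import json
from collections import defaultdict, deque

WINDOW = 3  # hypothetical number of readings to collect per source


def windows(messages, size=WINDOW):
    """Yield (source, values) whenever a source has `size` recent readings."""
    buffers = defaultdict(lambda: deque(maxlen=size))
    for topic, payload in messages:
        value = json.loads(payload)["value"]
        buf = buffers[topic]
        buf.append(value)  # the oldest reading is dropped once the buffer is full
        if len(buf) == size:
            yield topic, list(buf)


# Simulated (topic, payload) pairs standing in for broker messages.
simulated = [
    ("sensors/a", '{"value": 1.0}'),
    ("sensors/b", '{"value": 10.0}'),
    ("sensors/a", '{"value": 2.0}'),
    ("sensors/a", '{"value": 3.0}'),
    ("sensors/a", '{"value": 4.0}'),
]

ready = list(windows(simulated))
```

Each yielded window could then be handed to whatever model categorizes the source.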

Will the provided training data be labeled with the type of sensor the data originates from and normal/abnormal readings?

Metadata for the training data sets will be provided. The data sets we provide will not include labels for anomalies.

What other information will be given in addition to sensor data/type of sensor data?

The metadata will contain sensor information as well as the label for the data (e.g., temperature, wind speed, etc.). For unknown streaming data, the contestants are supposed to identify/categorize the unknown streaming data, prepare/process the data, and make it useful for first responders.
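
To illustrate how labeled training data could be used to categorize an unlabeled stream, here is a minimal nearest-centroid sketch. It is one simple technique among many, not the organizers' method; the features (mean and standard deviation of a window) and the sample values are assumptions chosen for the example.

```python
from math import dist
from statistics import mean, pstdev


def features(window):
    """Summarize a window of readings as (mean, standard deviation)."""
    return (mean(window), pstdev(window))


def train(labeled_windows):
    """Compute one centroid per label from (label, window) pairs."""
    grouped = {}
    for label, window in labeled_windows:
        grouped.setdefault(label, []).append(features(window))
    return {
        label: tuple(mean(axis) for axis in zip(*feats))
        for label, feats in grouped.items()
    }


def classify(centroids, window):
    """Return the label whose centroid is closest to the window's features."""
    point = features(window)
    return min(centroids, key=lambda label: dist(centroids[label], point))


# Hypothetical labeled training windows; labels would come from the metadata.
training = [
    ("temperature", [20.5, 21.0, 21.4]),
    ("temperature", [18.9, 19.2, 19.0]),
    ("wind speed", [3.1, 7.8, 5.2]),
    ("wind speed", [2.0, 6.5, 9.1]),
]

centroids = train(training)
label = classify(centroids, [22.1, 22.3, 22.0])
```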

What is the visualization requirement?

Data visualization or some sort of UI component will be part of the final presentation of the deliverables. Although we do not limit how the results are presented, the quality of the presentation will affect the judges' scores.

Will contestants also need to create data visualizations or some sort of UI component with the output data?

Data visualization or some sort of UI component will be part of the final presentation of the deliverables. Although we do not limit how the results will be presented, how well the results are presented will affect the scoring.
