a Healthier

METACINE is a tool for both patients and doctors that predicts the probability of developing a condition based on the patient's previous health records. Users can input patient information to run against our machine learning model and see which conditions the patient is at risk for.


How It Works

Training Data

The program uses Synthea to generate the large volume of data needed to train the model. Synthea is a synthetic patient generator developed by MITRE that produces realistic but artificial Fast Healthcare Interoperability Resources (FHIR) data. FHIR is a standard for describing data in electronic health records. It has become increasingly prevalent in digital healthcare because it gives systems a common framework for machine-to-machine communication.

Data Cleansing

The Synthea-generated FHIR data contains an abundance of information that is not necessary for our training purposes. Our system implements an algorithm that parses the FHIR data and retrieves each individual's conditions, the duration of each condition as a percentage of the individual's lifetime, and the current status of each condition. The system is modularized so that the conditions used for prediction can easily be changed.
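The extraction step described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual parser: the function names are hypothetical, the rounding to two decimals is an assumption, and a condition is treated as active simply when it has no `abatementDateTime`. The field names `birthDate`, `onsetDateTime`, `abatementDateTime`, and `code.text` come from the FHIR Patient and Condition resources.

```python
from datetime import date


def _to_date(value):
    """Parse the date part of a FHIR date or dateTime string (YYYY-MM-DD...)."""
    return date.fromisoformat(value[:10])


def extract_conditions(bundle, today):
    """Return {condition name: {"active": 0/1, "duration": fraction of lifetime}}."""
    resources = [entry["resource"] for entry in bundle.get("entry", [])]
    patient = next(r for r in resources if r["resourceType"] == "Patient")
    lifetime_days = (today - _to_date(patient["birthDate"])).days

    features = {}
    for res in resources:
        if res["resourceType"] != "Condition":
            continue
        onset = _to_date(res["onsetDateTime"])
        # Assumption: a condition without an abatement date is still ongoing.
        abated = res.get("abatementDateTime")
        end = _to_date(abated) if abated else today
        duration = (end - onset).days / lifetime_days if lifetime_days else 0.0
        features[res["code"]["text"]] = {
            "active": 0 if abated else 1,
            # Clamp to [0, 1] so the value fits the API's duration range.
            "duration": round(min(max(duration, 0.0), 1.0), 2),
        }
    return features
```

Keeping the output keyed by condition name makes it straightforward to select only the conditions the model is trained on, which matches the modular design mentioned above.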

Data Storage

The generated Synthea data is stored in a cloud FHIR data service, which gives our API and model access to the FHIR data via API endpoints. This will allow easy access in the future for our interactive telemedicine system, which is coming soon.

Model Generation

Using the cleansed FHIR data, an artificial neural network is built with TensorFlow. The network is composed of 2 hidden layers, each with 64 nodes and a ReLU activation function. The model is trained with a 0.6-0.2-0.2 train-validate-test split. Its final mean-squared error on the test data is 0.04.
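To make the split ratio and the reported metric concrete, here is a small standard-library sketch. It does not build the TensorFlow network itself; it only illustrates how a 0.6-0.2-0.2 partition and a mean-squared error can be computed. The function names, the shuffling, and the fixed seed are assumptions, not the project's training code.

```python
import random


def train_validate_test_split(rows, ratios=(0.6, 0.2, 0.2), seed=0):
    """Shuffle rows and cut them into train, validation, and test partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * ratios[0])
    n_val = round(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    validate = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # Whatever remains is the test set.
    return train, validate, test


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

The test partition is held out until training is finished, so an error measured on it reflects data the model has not seen.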


The API exposes our trained machine learning model, saved in h5 format, through an endpoint. It accepts a set of JSON-formatted values that populate the model's data frame and returns the predicted strength of each condition the model was trained on. The API defines a single resource, "/predict", which checks each input variable against its allowed range, rejects out-of-range values, and formats the result as a nested array.
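The range check on "/predict" might look roughly like the sketch below. The function name `check_bounds`, the default of seven conditions, and the limits table are assumptions for illustration; the limits follow the Data constraints section and the message wording follows the example in the Error Response section.

```python
def check_bounds(user, n_conditions=7):
    """Return a list of error strings for values outside their allowed range."""
    limits = {"Age": (0.0, 100.0), "Gender": (0.0, 1.0)}
    for i in range(1, n_conditions + 1):
        for suffix in ("", "_active", "_duration"):
            limits[f"condition{i}{suffix}"] = (0.0, 1.0)

    errors = []
    for name, (low, high) in limits.items():
        value = user.get(name)
        if value is None:
            continue  # Missing fields are not handled by this sketch.
        if not low <= value <= high:
            errors.append(
                f"Out of bounds: {name}, has value of: {value}, "
                f"but should be between {low} and {high}."
            )
    return errors
```

This sketch only checks ranges. It does not verify that flags such as `Gender` or `condition[x]_active` are whole numbers, which a stricter validator would also want to enforce.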



Used to return similar conditions.

URL : /predict

Method: POST

Auth required : NO

Data constraints:

  {
    "data": {
      "user": {
        "Age": [valid integer between 0-100],
        "Gender": [0 for female, 1 for male],
        "condition[x]": [0 for false, 1 for true],
        "condition[x]_active": [0 for false, 1 for true],
        "condition[x]_duration": [decimal between 0 and 1 for duration]
      }
    }
  }

Data example:

  {
    "data": {
      "user": {
        "Age": 50,
        "Gender": 1,
        "condition1": 1,
        "condition1_active": 0,
        "condition1_duration": 0.38,
        "condition2": 1,
        "condition2_active": 0,
        "condition2_duration": 0.75,
        "condition3": 1,
        "condition3_active": 0,
        "condition3_duration": 0.23,
        "condition4": 0,
        "condition4_active": 0,
        "condition4_duration": 0,
        "condition5": 1,
        "condition5_active": 1,
        "condition5_duration": 0.31,
        "condition6": 1,
        "condition6_active": 0,
        "condition6_duration": 0.9,
        "condition7": 1,
        "condition7_active": 0,
        "condition7_duration": 0.11
      }
    }
  }

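As a hedged usage sketch, the data example can be turned into a POST request with Python's standard library. The host `http://localhost:5000` and the helper name `build_predict_request` are hypothetical, since only the "/predict" path is documented here. The outer `{"data": {"user": ...}}` envelope is assumed from the example payload.

```python
import json
import urllib.request

# Hypothetical host; only the "/predict" path is documented.
BASE_URL = "http://localhost:5000"


def build_predict_request(user, base_url=BASE_URL):
    """Wrap the user features in the assumed envelope and build a POST request."""
    body = json.dumps({"data": {"user": user}}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Rebuild the data example: (present, active, duration) for conditions 1 to 7.
user = {"Age": 50, "Gender": 1}
example = [(1, 0, 0.38), (1, 0, 0.75), (1, 0, 0.23), (0, 0, 0),
           (1, 1, 0.31), (1, 0, 0.9), (1, 0, 0.11)]
for i, (present, active, duration) in enumerate(example, start=1):
    user[f"condition{i}"] = present
    user[f"condition{i}_active"] = active
    user[f"condition{i}_duration"] = duration

request = build_predict_request(user)
# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Because out-of-range input is also reported with a 200 OK status, a client should inspect the "errors" list in the response body rather than rely on the HTTP status code.
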
Success Response

Code: 200 OK

Content example:

  "errors": [],
  "id": "5e2d8df4-6544-419b-a7db-7d007ea6f93e",
  "value": [

Error Response

Condition : If any data constraint is outside the value range.

Code : 200 OK

  {
    "errors": [
      "Out of bounds: condition7_active, has value of: 3, but should be between 0.0 and 1.0."
    ],
    "id": "2ce701e5-7331-462b-b5a2-dfb4c4d4f500"
  }