How to migrate external API data to an Amorphic dataset

Tidbits

  • External API connections are used to migrate data from an API endpoint to an Amorphic dataset.
  • Usually, these API endpoints are created by AWS API Gateway.
  • Only Basic authentication is supported.
  • For this workshop, let's ingest publicly available country-wise COVID-19 data from covid-api.mmediagroup.fr/v1. The API's code runs on AWS Lambda. More details at https://github.com/M-Media-Group/Covid-19-API.
  • The API returns data in JSON format.
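
To make the points above concrete, here is a minimal sketch of what a GET request with optional Basic authentication looks like in Python's standard library. It is not how Amorphic implements the connection; the helper names `build_request` and `fetch_json` are hypothetical, and the endpoint is the one used later in this workshop.

```python
import base64
import json
import urllib.request

# Endpoint used in this workshop (taken from the connection details).
API_ENDPOINT = "https://covid-api.mmediagroup.fr/v1/cases"


def build_request(url, username=None, password=None):
    """Build a GET request; add a Basic auth header only when credentials are given."""
    request = urllib.request.Request(url, method="GET")
    request.add_header("Accept", "application/json")
    if username is not None and password is not None:
        # Basic auth is "Basic " + base64("username:password").
        raw = f"{username}:{password}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_json(request, timeout=30):
    """Send the request and parse the response body as JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(build_request(API_ENDPOINT))` would perform the actual download, provided the public endpoint is still reachable.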

Create a source connection

  • Open Connections in one of three ways: click the 'Connections' widget on the home screen, click INGESTION --> Connections in the left navigation bar, or click Navigator in the top right corner and search for Connections.
  • Click on the ➕ icon at the top right corner.
  • Enter the following details and click on Create Connection.
{
"Connection Name": "remote-api-2-amorphic-<your-userid>"
"Connection Type": "External API"
"Description": "Ingest from a publicly available country-wise `COVID-19` data using `covid-api.mmediagroup.fr/v1` to Amorphic. "
"Authorized Users": "Select your user name and any other user names you want to grant permission"
"Keywords": "Add relevant keywords like 'ext-api'. This will be useful for search"
"Version": "1.0"
"API Endpoint": "https://covid-api.mmediagroup.fr/v1/cases"
"API Authentication": "BASIC"
"Method": "GET"
}

Create a target dataset

  • Click on 'DATASETS' --> 'Datasets' in the left navigation bar.
  • Click on the ➕ icon at the top right corner.
  • Enter the following information and click on 'Register'.
{
"Dataset Name": "extapi_2_amd_ds_<your_userid>"
"Description": "This dataset is a destination for external API connection remote-api-2-amorphic-<your-userid>"
"Domain": "workshop(workshop)"
"Data Classifications":
"Keywords": "ext-api, covid-19"
"Connection Type": "External API"
"File Type": "Others"
"Target Location": "S3"
"Update Method": "Append"
"Connection": "remote-api-2-amorphic-<your-userid>"
"Enable Malware Detection": "No"
"Enable AI Services": "No"
"Enable Data Cleanup": "No"
}

Set up a schedule

  • Click on 'SCHEDULES' in the left navigation bar.
  • Click on the ➕ icon at the top right corner.
  • Enter the following information and click on 'Create'.
{
"Schedule Name": "extapi_2_ds_sched_<your_userid>"
"Description": "This schedule runs every 5 minutes to pull data from an external API to the Amorphic dataset."
"Type Of Job": "Data Ingestion"
"Select Dataset": "extapi_2_amd_ds_<your_userid> | ext-api" <-- Click the ↩️ icon to refresh the list
"Keywords": "your_userid, ext-api"
"Allocated Capacity":
"Schedule Type": "Time Based"
"Schedule Expression": "rate(5 minutes)"
}
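
To see how often the schedule above fires, the sketch below converts a `rate(...)` expression into an interval and counts runs per day. It assumes Amorphic accepts the same `rate(value unit)` form used by AWS schedule expressions; the helper names `parse_rate` and `runs_per_day` are hypothetical and not part of Amorphic.

```python
import re
from datetime import timedelta

# Matches expressions such as "rate(5 minutes)" or "rate(1 hour)".
_RATE_PATTERN = re.compile(r"^rate\((\d+) (minute|minutes|hour|hours|day|days)\)$")
_UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


def parse_rate(expression):
    """Return the interval described by a rate expression as a timedelta."""
    match = _RATE_PATTERN.match(expression.strip())
    if match is None:
        raise ValueError(f"Not a rate expression: {expression!r}")
    value = int(match.group(1))
    if value < 1:
        raise ValueError("The rate value must be a positive integer")
    unit = match.group(2).rstrip("s")  # "minutes" -> "minute"
    return timedelta(seconds=value * _UNIT_SECONDS[unit])


def runs_per_day(expression):
    """Count how many times the schedule fires in 24 hours."""
    return timedelta(days=1) // parse_rate(expression)
```

With `rate(5 minutes)`, `runs_per_day` gives 288 calls to the API per day, which is why the schedule is disabled at the end of this workshop.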

Check data transfer

  • The Execution Status tab of the schedule shows the status of each execution.
  • Hover over the message icon ✅ to check the job status.
  • For more details, click on the 'three dots' and check the output logs.
  • Check the Files tab of the dataset. The latest COVID-19 data, in JSON format, is migrated here.
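
If you download one of the migrated files, a quick check like the following can confirm that it contains valid JSON. This is a minimal sketch; `summarize_json_text` is a hypothetical helper and makes no assumption about the fields in the file.

```python
import json


def summarize_json_text(text):
    """Parse JSON text and report its top-level type and number of entries."""
    payload = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if isinstance(payload, (dict, list)):
        entries = len(payload)
    else:
        entries = 1  # a bare string, number, boolean, or null
    return {"type": type(payload).__name__, "entries": entries}
```

For a downloaded file, pass its contents, for example `summarize_json_text(open(path, encoding="utf-8").read())`, where `path` is wherever you saved the file.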

Disable schedule

  • Don't leave the schedule running forever. Disabling it once you have verified the transfer reduces the load on the API.
  • Click on the Disable Schedule icon on the schedule page.
  • Click Yes to confirm.


You can do more...
  • Analyze the JSON data in an ETL job to get insights from the COVID-19 data.
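
As a starting point for such an analysis, the sketch below flattens nested JSON into one row per top-level key, which is a common first step before loading data into a table. The sample shape used in the comments (a country name mapping to nested numeric fields) is an assumption for illustration, not a guarantee about the API's response; `flatten` and `to_rows` are hypothetical helper names.

```python
def flatten(record, prefix=""):
    """Flatten nested dictionaries into a single dict with dotted key paths."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def to_rows(payload, key_name="country"):
    """Turn {"name": {...nested fields...}} into a list of flat row dicts."""
    return [
        {key_name: name, **flatten(details)}
        for name, details in payload.items()
        if isinstance(details, dict)  # skip top-level values that are not objects
    ]


# Assumed sample shape, for illustration only:
# {"CountryA": {"All": {"confirmed": 3, "deaths": 1}}}
# becomes
# [{"country": "CountryA", "All.confirmed": 3, "All.deaths": 1}]
```

The resulting rows can then be fed to whatever the ETL job uses for aggregation, such as a CSV writer or a DataFrame.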