COVID-19 Public Data integration summary

Stitch’s COVID-19 Public Data integration was developed as a collaboration between Bytecode and Talend. It replicates data from multiple public data sources using the GitHub REST API v3. Refer to the COVID-19 Public Data table reference section for a list of objects available for replication.

COVID-19 Public Data feature snapshot

A high-level look at Stitch's COVID-19 Public Data (v1) integration, including release status, useful links, and the features supported in Stitch.

STITCH
  Release status: Released on April 2, 2020
  Supported by: Singer Community
  Stitch plan: Standard
  API availability: Available
  Singer GitHub repository: singer-io/tap-covid-19

REPLICATION SETTINGS
  Anchor Scheduling: Supported
  Advanced Scheduling: Supported
  Table-level reset: Unsupported
  Configurable Replication Methods: Unsupported

DATA SELECTION
  Table selection: Supported
  Column selection: Supported
  Select all: Supported

TRANSPARENCY
  Extraction Logs: Supported
  Loading Reports: Supported

Connecting COVID-19 Public Data

COVID-19 Public Data setup requirements

To set up COVID-19 Public Data in Stitch, you need:

  • A regular (free) GitHub account. The GitHub repo for this integration is public, so no special access is required.


Step 1: Create a GitHub personal access token

  1. Sign into your GitHub account.
  2. Click the User menu (your icon) > Settings.
  3. Click Developer settings in the navigation on the left side of the page.
  4. Click Personal access tokens.
  5. On the Personal access tokens page, click the Generate new token button. If prompted, enter your password.
  6. In the Description field, enter stitch. This will allow you to easily identify what application is using the token.
  7. In the Select Scopes section, check the repo option:

    Highlighted repo scopes on the GitHub Personal Access Tokens page

    Note: While these are full permissions, Stitch will only ever read your data. The repo scope is required due to how GitHub structures permissions.

  8. Click the Generate token button.
  9. The new access token will display on the next page. Copy the token before navigating away from the page, because GitHub won’t display it again.
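
Before pasting the token into Stitch, it can help to see how a personal access token is attached to a GitHub REST API v3 request. The following is a minimal sketch, not Stitch's own code: the function name, the /user path, and the placeholder token are illustrative assumptions. It only builds the request object and does not send it.

```python
import urllib.request

GITHUB_API = "https://api.github.com"


def build_github_request(token: str, path: str = "/user") -> urllib.request.Request:
    """Build (but do not send) an authenticated GitHub REST API v3 request.

    The token value and the default path are placeholders for illustration.
    """
    return urllib.request.Request(
        GITHUB_API + path,
        headers={
            # Personal access tokens are passed in the Authorization header.
            "Authorization": f"token {token}",
            # Explicitly request version 3 of the REST API.
            "Accept": "application/vnd.github.v3+json",
        },
    )


request = build_github_request("ghp_example_placeholder")
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it. A successful response is one way to confirm that the token was copied correctly.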

Step 2: Add COVID-19 Public Data as a Stitch data source

  1. Sign into your Stitch account.
  2. On the Stitch Dashboard page, click the Add Integration button.

  3. Click the COVID-19 icon.

  4. Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your destination.

    For example, the name “Stitch COVID-19 Public Data” would create a schema called stitch_covid_19_public_data in the destination. Note: Schema names cannot be changed after you save the integration.

  5. In the GitHub Access Token field, paste the access token you created in Step 1.
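
The naming example in step 4 can be reproduced with a small helper. This is a sketch under an assumption: the exact normalization rules Stitch applies are not stated here, so the function below (a hypothetical name) simply lowercases the integration name and collapses every run of non-alphanumeric characters into an underscore, which matches the example given.

```python
import re


def schema_name_from_integration_name(name: str) -> str:
    """Approximate the destination schema name for an integration name.

    Assumed rules: lowercase, then replace each run of characters other
    than a-z and 0-9 with a single underscore.
    """
    lowered = name.lower()
    underscored = re.sub(r"[^a-z0-9]+", "_", lowered)
    # Drop any underscores left over at the start or end of the name.
    return underscored.strip("_")


print(schema_name_from_integration_name("Stitch COVID-19 Public Data"))
```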

Step 3: Define the historical replication start date

The Sync Historical Data setting defines the starting date for your COVID-19 Public Data integration. Data dated on or after this date will be replicated to your destination.

Change this setting if you want to replicate data beyond COVID-19 Public Data’s default setting of 1 year. For a detailed look at historical replication jobs, check out the Syncing Historical SaaS Data guide.
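
The effect of the start date can be pictured as a simple filter. The sketch below is illustrative only: the sample rows and the function name are made up, and it assumes each row carries a date value that is compared against the start date inclusively.

```python
from datetime import date


def rows_on_or_after(rows, start_date):
    """Keep rows whose date is equal to or newer than start_date."""
    return [row for row in rows if row["date"] >= start_date]


# Hypothetical sample rows, not real data.
sample_rows = [
    {"date": date(2020, 3, 30), "death": 1},
    {"date": date(2020, 4, 2), "death": 2},
    {"date": date(2020, 4, 5), "death": 3},
]

kept = rows_on_or_after(sample_rows, date(2020, 4, 2))
print([row["date"].isoformat() for row in kept])
```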

Step 4: Create a replication schedule

In the Replication Frequency section, you’ll create the integration’s replication schedule. An integration’s replication schedule determines how often Stitch runs a replication job, and the time that job begins.

COVID-19 Public Data integrations support both Anchor Scheduling and Advanced Scheduling.

To keep your row usage low, consider setting the integration to replicate less frequently. See the Understanding and Reducing Your Row Usage guide for tips on reducing your usage.

Step 5: Set objects to replicate

The last step is to select the tables and columns you want to replicate. See the COVID-19 Public Data table reference section for the tables available for this integration.

Note: If a replication job is currently in progress, new selections won’t be used until the next job starts.

For COVID-19 Public Data integrations, you can select:

  1. Individual tables and columns

  2. All tables and columns

Instructions for each selection method follow.

Individual tables and columns

  1. In the integration’s Tables to Replicate tab, locate a table you want to replicate.
  2. To track a table, click the checkbox next to the table’s name. A blue checkmark means the table is set to replicate.
  3. To track a column, click the checkbox next to the column’s name. A blue checkmark means the column is set to replicate.
  4. Repeat this process for all the tables and columns you want to replicate.
  5. When finished, click the Finalize Your Selections button at the bottom of the screen to save your selections.

All tables and columns

  1. Click into the integration from the Stitch Dashboard page.
  2. Click the Tables to Replicate tab.
  3. In the list of tables, click the box next to the Table Names column.
  4. In the menu that displays, click Track all Tables and Fields:

    The Track all Tables and Fields menu in the Tables to Replicate tab

  5. Click the Finalize Your Selections button at the bottom of the page to save your data selections.
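
Conceptually, column selection narrows each replicated record to the tracked columns. The sketch below illustrates that idea only. It is not how Stitch implements selection, and the function name and sample record are assumptions.

```python
def apply_column_selection(record: dict, selected_columns: set) -> dict:
    """Return a copy of record containing only the tracked columns."""
    return {
        column: value
        for column, value in record.items()
        if column in selected_columns
    }


# Hypothetical record using column names from c19_trk_us_daily.
record = {"__sdc_row_number": 1, "date": "2020-04-02", "death": 5, "pending": 10}
selected = apply_column_selection(record, {"__sdc_row_number", "date", "death"})
print(sorted(selected))
```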

Initial and historical replication jobs

After you finish setting up COVID-19 Public Data, its Sync Status may show as Pending on either the Stitch Dashboard or the Integration Details page.

For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial replication job for the integration. This may take some time to complete.

Free historical data loads

The first seven days of replication, beginning when data is first replicated, are free. Rows replicated from the new integration during this time won’t count towards your quota. Stitch offers this as a way of testing new integrations, measuring usage, and ensuring historical data volumes don’t quickly consume your quota.
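
As a rough illustration of the seven-day window, the sketch below decides whether a row would count towards the quota based on when data was first replicated. The function name is hypothetical, and the exact boundary handling (a row replicated at precisely seven days is treated as billable) is an assumption.

```python
from datetime import datetime, timedelta

FREE_PERIOD = timedelta(days=7)


def counts_toward_quota(first_replicated_at: datetime, row_replicated_at: datetime) -> bool:
    """Return True if the row falls outside the free seven-day window."""
    return row_replicated_at >= first_replicated_at + FREE_PERIOD


first = datetime(2020, 4, 2, 12, 0)
print(counts_toward_quota(first, first + timedelta(days=3)))
print(counts_toward_quota(first, first + timedelta(days=8)))
```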


COVID-19 Public Data table reference

c19_trk_us_daily

The c19_trk_us_daily table contains statistics for the United States, aggregated by day.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.
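
The behavior described in this note, which also applies to the other single-file tables in this reference, can be sketched as a comparison between the file's git_last_modified value and a bookmark saved from the previous job. This is an assumption-laden illustration, not the tap's actual code: the function name, the bookmark variable, and the strict greater-than comparison are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def file_needs_replication(git_last_modified: datetime, bookmark: Optional[datetime]) -> bool:
    """Decide whether the whole source file should be replicated again.

    bookmark is the git_last_modified value saved by the previous job,
    or None if no job has run yet.
    """
    if bookmark is None:
        # First job: nothing has been replicated, so take the entire file.
        return True
    # Later jobs: replicate the entire file only if it changed since then.
    return git_last_modified > bookmark


last_job = datetime(2020, 4, 2, 20, 0, tzinfo=timezone.utc)
unchanged = datetime(2020, 4, 2, 20, 0, tzinfo=timezone.utc)
updated = datetime(2020, 4, 3, 20, 0, tzinfo=timezone.utc)

print(file_needs_replication(unchanged, last_job))
print(file_needs_replication(updated, last_job))
```

Because the whole file is replicated whenever it changes, every row in the file is loaded again on those runs, which is worth keeping in mind when estimating row usage.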

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_daily schema on GitHub

__sdc_row_number

INTEGER

date

DATE

date_checked

DATE-TIME

death

INTEGER

death_increase

INTEGER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

hospitalized

INTEGER

hospitalized_increase

INTEGER

negative

INTEGER

negative_increase

INTEGER

pending

INTEGER

pos_neg

INTEGER

positive

INTEGER

positive_increase

INTEGER

states

INTEGER

total

INTEGER

total_test_results

INTEGER

total_test_results_increase

INTEGER

c19_trk_us_population_counties

The c19_trk_us_population_counties table contains statistics for reporting counties in the United States.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_population_counties schema on GitHub

__sdc_row_number

INTEGER

county

STRING

geo_id

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

pop_density

NUMBER

population

INTEGER

state

STRING

state_name

STRING

c19_trk_us_population_states

The c19_trk_us_population_states table contains population statistics for states in the United States.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_population_states schema on GitHub

__sdc_row_number

INTEGER

geo_id

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

pop_density

NUMBER

population

INTEGER

state

STRING

state_name

STRING

c19_trk_us_population_states_age_groups

The c19_trk_us_population_states_age_groups table contains population statistics in the United States, aggregated by age group and state.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_population_states_age_groups schema on GitHub

__sdc_row_number

INTEGER

agegroup

STRING

geo_id

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

pct_pop

NUMBER

population

INTEGER

state

STRING

state_name

STRING

c19_trk_us_states_acs_health_insurance

The c19_trk_us_states_acs_health_insurance table contains health insurance statistics for the United States.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_states_acs_health_insurance schema on GitHub

__sdc_row_number

INTEGER

acs_variable

STRING

age_group

STRING

concept

STRING

coverage_type

STRING

employed

STRING

estimate

NUMBER

estimate_type

STRING

geo_id

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

label

STRING

labor_force

STRING

margin_of_error

NUMBER

state

STRING

state_name

STRING

c19_trk_us_states_current

The c19_trk_us_states_current table contains current statistics for states in the United States.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_states_current schema on GitHub

__sdc_row_number

INTEGER

check_time_et

STRING

commercial_score

INTEGER

date_checked

DATE-TIME

date_modified

DATE-TIME

death

INTEGER

fips

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

grade

STRING

hospitalized

INTEGER

last_update_et

STRING

negative

INTEGER

negative_regular_score

INTEGER

notes

STRING

pending

INTEGER

positive

INTEGER

positive_score

INTEGER

score

INTEGER

state

STRING

total

INTEGER

total_test_results

INTEGER

c19_trk_us_states_daily

The c19_trk_us_states_daily table contains historical data for states in the United States, aggregated by day. Note: COVID-19 Public Data updates this data every day at 4PM ET.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_states_daily schema on GitHub

__sdc_row_number

INTEGER

date

DATE

date_checked

DATE-TIME

death

INTEGER

death_increase

INTEGER

fips

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

hospitalized

INTEGER

hospitalized_increase

INTEGER

negative

INTEGER

negative_increase

INTEGER

pending

INTEGER

positive

INTEGER

positive_increase

INTEGER

state

STRING

total

INTEGER

total_test_results

INTEGER

total_test_results_increase

INTEGER

c19_trk_us_states_info

The c19_trk_us_states_info table contains information about states in the United States.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_states_info schema on GitHub

__sdc_row_number

INTEGER

covid19_site

STRING

covid19_site_old

STRING

covid19_site_secondary

STRING

fips

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

name

STRING

notes

STRING

pui

STRING

pum

STRING

state

STRING

twitter

STRING

c19_trk_us_states_kff_hospital_beds

The c19_trk_us_states_kff_hospital_beds table contains statistics about hospital beds per 1,000 population, segmented by hospital ownership type.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

c19_trk_us_states_kff_hospital_beds schema on GitHub

__sdc_row_number

INTEGER

for_profit

NUMBER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

non_profit

NUMBER

state

STRING

state_local_government

NUMBER

state_name

STRING

total

NUMBER

eu_daily

The eu_daily table contains statistics for the European Union, aggregated by day. This data is sourced from the covid19-eu-data GitHub repository.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

eu_daily schema on GitHub

__sdc_row_number

INTEGER

cases

NUMBER

cases_100k_pop

NUMBER

cases_lower

INTEGER

cases_upper

INTEGER

country

STRING

date

DATE

datetime

DATE-TIME

deaths

NUMBER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

hospitalized

NUMBER

intensive_care

INTEGER

lau

STRING

nuts_1

STRING

nuts_2

STRING

nuts_3

STRING

percent

NUMBER

population

NUMBER

quarantine

INTEGER

recovered

INTEGER

tests

INTEGER
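
The eu_daily table above lists two Primary Keys, __sdc_row_number and git_path. One way to read this is that a row number is only unique within a single source file, so the file path is needed to tell rows from different files apart. The sketch below illustrates that reading with an in-memory upsert. The function name, the destination dictionary, and the sample file paths are assumptions for illustration, not a description of how Stitch loads data.

```python
def upsert(destination: dict, record: dict) -> None:
    """Insert or replace a record, keyed by (git_path, __sdc_row_number)."""
    key = (record["git_path"], record["__sdc_row_number"])
    destination[key] = record


destination = {}

# Same row number in two different (hypothetical) files: two distinct rows.
upsert(destination, {"git_path": "dataset/a.csv", "__sdc_row_number": 1, "cases": 10})
upsert(destination, {"git_path": "dataset/b.csv", "__sdc_row_number": 1, "cases": 20})

# Same file and row number again: the existing row is replaced.
upsert(destination, {"git_path": "dataset/a.csv", "__sdc_row_number": 1, "cases": 12})

print(len(destination))
```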

eu_ecdc_daily

The eu_ecdc_daily table contains statistics reported to the European Centre for Disease Prevention and Control, segmented by day.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

eu_ecdc_daily schema on GitHub

__sdc_row_number

INTEGER

cases

NUMBER

country

STRING

date

DATE

datetime

DATE-TIME

deaths

NUMBER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

italy_national_daily

The italy_national_daily table contains statistics for Italy, segmented by day.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

italy_national_daily schema on GitHub

__sdc_row_number

INTEGER

country

STRING

date

DATE

date_of_notification

DATE-TIME

datetime

DATE-TIME

deaths

INTEGER

discharged_recovered

INTEGER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

home_isolation

INTEGER

hospitalized_with_symptoms

INTEGER

intensive_care

INTEGER

new_currently_positive

INTEGER

note_en

STRING

note_it

STRING

tested

INTEGER

total_cases

INTEGER

total_currently_positive

INTEGER

total_hospitalized

INTEGER

italy_provincial_daily

The italy_provincial_daily table contains statistics for Italian provinces, segmented by day.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

italy_provincial_daily schema on GitHub

__sdc_row_number

INTEGER

country

STRING

date

DATE

date_of_notification

DATE-TIME

datetime

DATE-TIME

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

lat

NUMBER

long

NUMBER

note_en

STRING

note_it

STRING

province

STRING

province_abbr

STRING

province_code

STRING

region

STRING

region_code

STRING

total_cases

INTEGER

italy_regional_daily

The italy_regional_daily table contains statistics for Italian regions, segmented by day.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

italy_regional_daily schema on GitHub

__sdc_row_number

INTEGER

country

STRING

date

DATE

date_of_notification

DATE-TIME

datetime

DATE-TIME

deaths

INTEGER

discharged_recovered

INTEGER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

home_isolation

INTEGER

hospitalized_with_symptoms

INTEGER

intensive_care

INTEGER

lat

NUMBER

long

NUMBER

new_currently_positive

INTEGER

note_en

STRING

note_it

STRING

region

STRING

region_code

STRING

tested

INTEGER

total_cases

INTEGER

total_currently_positive

INTEGER

total_hospitalized

INTEGER

jh_csse_daily

The jh_csse_daily table contains data collected by Johns Hopkins CSSE.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

COVID-19 Public Data documentation

jh_csse_daily schema on GitHub

__sdc_row_number

INTEGER

active

INTEGER

admin_area

STRING

combined_key

STRING

confirmed

INTEGER

country_region

STRING

country_region_cleansed

STRING

date

DATE

datetime

DATE-TIME

deaths

INTEGER

fips

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

is_a_cruise

BOOLEAN

last_update

DATE-TIME

latitude

NUMBER

longitude

NUMBER

province_state

STRING

province_state_cleansed

STRING

recovered

INTEGER

neherlab_case_counts

The neherlab_case_counts table contains case counts by location and date from the Neherlab Scenarios Data published by Neherlab Biozentrum, Center for Computational Biology.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

neherlab_case_counts schema on GitHub

__sdc_row_number

INTEGER

cases

INTEGER

date

DATE

datetime

DATE-TIME

deaths

INTEGER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

hospitalized

INTEGER

icu

INTEGER

location

STRING

recovered

INTEGER

neherlab_country_codes

The neherlab_country_codes table contains country and region codes from the Neherlab Scenarios Data published by Neherlab Biozentrum, Center for Computational Biology.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

neherlab_country_codes schema on GitHub

__sdc_row_number

INTEGER

alpha_2

STRING

alpha_3

STRING

country_code

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

intermediate_region

STRING

intermediate_region_code

STRING

iso_3166_2

STRING

name

STRING

region

STRING

region_code

STRING

sub_region

STRING

sub_region_code

STRING

neherlab_icu_capacity

The neherlab_icu_capacity table contains ICU and care capacity figures by country from the Neherlab Scenarios Data published by Neherlab Biozentrum, Center for Computational Biology.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

neherlab_icu_capacity schema on GitHub

__sdc_row_number

INTEGER

acute_care

INTEGER

acute_care_per_100k

INTEGER

country

STRING

critical_care

INTEGER

critical_care_per_100k

NUMBER

gdp

NUMBER

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

icu

INTEGER

imcu

INTEGER

percent_of_total

NUMBER

neherlab_population

The neherlab_population table contains population and hospital bed figures by country from the Neherlab Scenarios Data published by Neherlab Biozentrum, Center for Computational Biology.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Keys

__sdc_row_number

git_path

Replication Key

git_last_modified

Useful links

neherlab_population schema on GitHub

__sdc_row_number

INTEGER

country

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

hemisphere

STRING

hospital_beds

INTEGER

icu_beds

INTEGER

imports_per_day

NUMBER

name

STRING

population

STRING

suspected_cases_mar_1st

INTEGER

nytimes_us_counties

The nytimes_us_counties table contains data for counties in the United States, collected by The New York Times.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

nytimes_us_counties schema on GitHub

__sdc_row_number

INTEGER

cases

INTEGER

county

STRING

date

DATE

datetime

DATE-TIME

deaths

INTEGER

fips

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

state

STRING

state_code

STRING

nytimes_us_states

The nytimes_us_states table contains data for states in the United States, collected by The New York Times.

Note: The source file for this table is a single file that updates on a daily basis. When Stitch replicates this table, it will replicate the entire contents of the file, but only if the file has been modified since the integration’s last replication job.

Replication Method

Key-based Incremental

Primary Key

__sdc_row_number

Replication Key

git_last_modified

Useful links

nytimes_us_states schema on GitHub

__sdc_row_number

INTEGER

cases

INTEGER

date

DATE

datetime

DATE-TIME

deaths

INTEGER

fips

STRING

git_file_name

STRING

git_html_url

STRING

git_last_modified

DATE-TIME

git_owner

STRING

git_path

STRING

git_repository

STRING

git_sha

STRING

git_url

STRING

state

STRING

state_code

STRING

