Utilizing Social Determinants of Health (SDoH) to Provide Context for Your Clinical Research

Blog  |  10 June 2022

Written by: Cheryl Reifsnyder, PhD

In recent years, researchers have become increasingly aware of the need for improved diversity in clinical research. Current U.S. research exhibits significant gaps in diversity. For instance,

  • 83% of U.S. clinical trial participants are White, although White individuals make up 67% of the U.S. population
  • 5% of participants are African American, although African Americans make up 13% of the U.S. population
  • 1% of participants are Hispanic, although Hispanic individuals make up 18% of the U.S. population
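
To make the size of these gaps easier to compare, the short Python sketch below divides each group's share of trial participants by its share of the U.S. population. The percentages are the ones quoted above; the function and variable names are illustrative only.

```python
# Percent of trial participants and percent of U.S. population,
# taken from the figures quoted above.
SHARES = {
    "White": (83, 67),
    "African American": (5, 13),
    "Hispanic": (1, 18),
}


def representation_ratio(participant_pct, population_pct):
    """Participant share divided by population share; 1.0 means proportional."""
    return participant_pct / population_pct


ratios = {
    group: round(representation_ratio(participants, population), 2)
    for group, (participants, population) in SHARES.items()
}

for group, ratio in ratios.items():
    print(f"{group}: {ratio}")
# White: 1.24
# African American: 0.38
# Hispanic: 0.06
```

A ratio above 1 indicates a group is over-represented relative to its share of the population, and a ratio below 1 indicates it is under-represented.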

As you work toward improved diversity in your research, it’s important to recognize that diversity refers to more than race and ethnicity. Diversity also encompasses social and environmental factors that affect individuals’ health and well-being beyond the impact of good medical care. These factors are termed Social Determinants of Health (SDoH).

Healthy People 2030, a national health and well-being program, segments SDoH into 5 key areas:

  • Economic stability
  • Social and community context
  • Access to healthcare
  • Availability of education
  • Neighborhood and built environment

Research shows that these SDoH drive between 30% and 50% of health outcomes. Surprisingly, research also shows that SDoH can have a significant impact on clinical trial results.

Keep reading to learn how you can effectively identify reliable and relevant SDoH data to provide context for your clinical research.

Diversity in clinical research

Controlling for SDoH is essential when working to improve diversity in clinical research, as SDoH can have a major impact on study results.

However, it can also be extremely challenging to obtain SDoH data relevant to your specific research. While federal, state, and local efforts are helping to improve the integration of SDoH information into electronic health records (EHRs), there is a lack of consensus as to which SDoH measures are highest priority for capture. As a result, healthcare providers (HCPs) often fail to document important social risk factors in structured data fields. If HCPs do record social risk factors, it is often in unstructured fields that are not accessible to researchers.

Traditional sources of SDoH information

Researchers obtain SDoH data by mining data sources such as:

  • Insurance claims
  • Patient and disease registries
  • Health surveys
  • Commercial databases
  • Publicly available databases (e.g., the American Community Survey and the U.S. Census)
  • Centers for Medicare and Medicaid Services (CMS) data
  • EHR data

Not all data sources are equally reliable, and different data sources have different advantages and disadvantages. To ensure their data are trustworthy, researchers need:

  • A broad data set: Analyzing too narrow a data set may cause your study to include only certain population segments
  • A deep data set: To provide meaningful information about the study participants
  • Accurate, valid data: Data are only trustworthy when they originate from a trustworthy source

Health surveys

Data obtained from health surveys, for instance, may be less reliable than data from other sources. Respondents may choose not to answer challenging questions or may fail to remember key details, resulting in survey data of insufficient depth to provide study context.

Surveys can also have low response rates. This results in data based primarily on a subset of the population, lacking the breadth needed for diversity.

Surveillance and observational studies

In surveillance and observational studies, such as patient and disease registries, data come directly from diagnoses, lab results, and other patient records. These sources provide data that are more valid than health survey data because the data come from the ongoing, systematic collection of information related to specific groups of patients.

Insurance claims

The reliability of claims data depends on the information being extracted. For instance, services that are not covered by insurance do not appear in claims data sets. Claims data also cover only the care patients received; they do not cover care patients may have needed but did not receive. Claims data may also lack longitudinal information about patients’ health because they are limited to the specific timing of named events.

Claims data do not necessarily include SDoH information, and information that is included can be difficult to interpret because records from different care settings sometimes use different coding systems to identify medical procedures (e.g., ICD-10 versus Current Procedural Terminology [CPT] codes).
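
One place SDoH information can surface in coded claims data is the ICD-10-CM category range Z55 through Z65, which covers socioeconomic and psychosocial circumstances. The sketch below is a minimal illustration, assuming a claim's codes arrive as plain strings; the function names are hypothetical and are not part of any library or of any method described in this article.

```python
def is_sdoh_z_code(code):
    """Return True if the code falls in the ICD-10-CM range Z55 through Z65."""
    normalized = code.strip().upper().replace(".", "")
    # A code in this range is "Z" followed by a two-digit category number.
    if len(normalized) < 3 or normalized[0] != "Z" or not normalized[1:3].isdigit():
        return False
    return 55 <= int(normalized[1:3]) <= 65


def sdoh_codes(codes):
    """Keep only the codes on a claim that fall in the Z55 through Z65 range."""
    return [code for code in codes if is_sdoh_z_code(code)]


# Example: a claim mixing diagnosis codes with a CPT procedure code.
claim_codes = ["E11.9", "Z59.0", "99213", "Z56.0"]
print(sdoh_codes(claim_codes))  # ['Z59.0', 'Z56.0']
```

An empty result does not mean a patient has no social risk factors. It may only mean that none were coded on the claim.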

Electronic health records

EHR data contain detailed medical information, such as diagnoses, procedures, and test results. EHR data tend to be more reliable than data from most other sources because they come directly from healthcare providers’ notes and interactions with patients. However, even EHR data may be limited to the types of tests and procedures healthcare providers need to submit for reimbursement. Treatments, tests, and procedures that are unrelated to reimbursement may not be recorded consistently.

One advantage of EHRs is that data in structured EHR fields tend to be categoric, numeric, or coded, allowing them to be captured, organized, and analyzed with relative ease. However, much of the SDoH data captured in EHRs is found in unstructured or semi-structured fields, making it more difficult to access.
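
To illustrate why unstructured fields take more work, the sketch below scans a free-text note for phrases tied to the five Healthy People 2030 areas listed earlier. It is a deliberately simple keyword match, not a production natural language processing pipeline, and the keyword lists and function name are hypothetical examples.

```python
import re

# Hypothetical keyword lists, grouped by the five SDoH areas named earlier.
SDOH_KEYWORDS = {
    "Economic stability": ["unemployed", "lost job", "cannot afford"],
    "Social and community context": ["lives alone", "no family support"],
    "Access to healthcare": ["uninsured", "no transportation"],
    "Availability of education": ["did not finish high school", "low literacy"],
    "Neighborhood and built environment": ["homeless", "unstable housing"],
}


def flag_sdoh_mentions(note):
    """Return the SDoH areas whose keywords appear in a free-text note."""
    found = {}
    for area, keywords in SDOH_KEYWORDS.items():
        hits = [
            keyword
            for keyword in keywords
            # Whole-phrase, case-insensitive match.
            if re.search(r"\b" + re.escape(keyword) + r"\b", note, re.IGNORECASE)
        ]
        if hits:
            found[area] = hits
    return found


note = "Patient is currently unemployed and lives alone; reports no transportation to clinic."
print(flag_sdoh_mentions(note))
```

A keyword match like this cannot handle negation (for example, "denies being homeless"), misspellings, or context, which is part of why unstructured SDoH data are harder to use than coded fields.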

How the Veradigm Network can improve SDoH context for your research

There are healthcare companies that can deliver real-world clinical data and there are healthcare companies that can provide insights into that data; however, Veradigm is in the unique position of being able to do both.

Veradigm has a broad point-of-care presence via its network of ambulatory EHRs, Allscripts Professional and Practice Fusion. Access to multiple EHR systems enables Veradigm to provide researchers with data from a large, diverse patient population. The Veradigm Network encompasses about 20% of U.S. outpatient providers. It can also offer greater diversity than any single EHR; for example, one EHR in the network is Practice Fusion, a cloud-based solution accessible to small, rural practices that would not be included in data from larger EHR systems.

Veradigm’s research database is nationally representative. It contains records for over 180 million patients, of whom 45 million are linked to closed claims. Because the data are extracted from EHRs within our network, they are timely, giving researchers faster access to insights. EHR data are also more longitudinal in nature, helping researchers to better understand the patient journey.

Veradigm also brings the ability to link Veradigm data with partners’ data sources to improve both data content and geographic representation. If Veradigm’s data do not provide the required patient representation, we can rely on our network partners for data to fill in those gaps.

Accessing and analyzing SDoH data can be especially challenging because these data are often stored in unstructured or semi-structured EHR fields. Veradigm’s expert team of data scientists uses our proprietary technology, which allows data from unstructured fields to be accessed and analyzed more efficiently. The team applies artificial intelligence tools such as natural language processing and machine learning to extract SDoH information from the point of care. This gives researchers access to the unstructured data in these EHRs, where the bulk of SDoH data is recorded.

Contact us to learn how Veradigm can help you improve SDoH context for your clinical research or clinical trial.
