NIS data: working with the collected data set
The healthcare industry is interested in obtaining more data on the effects of marketed medicines in real clinical practice, which explains why non-interventional studies (NIS) grow in popularity every year. Advances in information technology for clinical trials (CT) steadily increase the volume of information collected in non-interventional studies. This, in turn, makes data collection and analysis more complex and requires specialists who work with clinical trial data to expand their competencies.
Developing a correct and understandable structure for data collection is extremely important in this kind of research, as described in more detail in our last publication, NIS DATA: APPROACHES TO ORGANIZATION OF DATA COLLECTION. However, every study, no matter how well designed and implemented, has to deal with data entry errors and their impact on the results. This problem, and the need to overcome it, is common to both experimental and observational studies. Therefore, when planning NIS data management processes, it is equally important to carefully develop an approach to handling data that have already been collected.
The data obtained in non-interventional studies often differ from data from classical clinical trials in several respects:
- the data collected are often more complex and voluminous;
- routine clinical practice differs between research centers, as does the way source documentation is maintained;
- as a rule, NIS run considerably longer than classical clinical trials.
In addition, NIS often minimize the number of monitoring visits to research centers, which reduces (or eliminates) reconciliation of the data entered in the CRF against source documentation. This places even more responsibility on the project team members involved in remote data validation and increases the need for careful centralized monitoring by data management specialists.
There are various approaches to finding incorrect or questionable data that need to be clarified and corrected, such as:
Searching for missing values.
This is probably the most basic way of checking collected data. It is especially relevant for non-interventional studies, where a significant amount of information may be missing, particularly when data on patient treatment are collected retrospectively.
However, it should be remembered that the database itself does not store the rules that determine when co-dependent variables are displayed. Accordingly, the dynamic structure of the eCRF forms must be taken into account when searching for missing values, so that fields that were never shown to the site are not reported as missing.
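As a minimal sketch of a missing-value search that respects a dynamic form structure, the Python example below keeps the display rules outside the data and consults them before reporting a blank field. The field names, the `DISPLAY_RULES` mapping and the sample records are hypothetical and serve only as an illustration; a real study would derive such rules from its own form specification.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical display rules: a field is shown on the form only when its
# condition holds. Fields without an entry here are always shown.
DISPLAY_RULES: Dict[str, Callable[[Record], bool]] = {
    "pregnancy_test_result": lambda r: r.get("sex") == "F",
    "smoking_pack_years": lambda r: r.get("smoker") == "yes",
}

# Hypothetical list of fields that must be filled in whenever they are shown.
REQUIRED_FIELDS = ["sex", "smoker", "pregnancy_test_result", "smoking_pack_years"]


def is_blank(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as missing."""
    return value is None or value.strip() == ""


def find_missing(record: Record) -> List[str]:
    """Return required fields that were shown to the site but left blank."""
    missing = []
    for field in REQUIRED_FIELDS:
        rule = DISPLAY_RULES.get(field)
        shown = True if rule is None else rule(record)
        if shown and is_blank(record.get(field)):
            missing.append(field)
    return missing


# Illustrative records, not real study data.
records = [
    {"subject": "001", "sex": "M", "smoker": "no",
     "pregnancy_test_result": None, "smoking_pack_years": None},
    {"subject": "002", "sex": "F", "smoker": "yes",
     "pregnancy_test_result": "", "smoking_pack_years": "12"},
]

report = {r["subject"]: find_missing(r) for r in records}
print(report)
```

In this sketch, subject 001 produces no findings because both blank fields were hidden by the form logic, while subject 002 is flagged for the blank pregnancy test result.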
Working with the logic of variable interaction.
Just as data in the real world are always interconnected, variables in a database are linked by a certain logic of interaction and co-dependence. It is considered good practice to involve specialists with medical expertise in developing logical checks for data validation.
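Logical checks of this kind can be written as a small table of rules, each of which inspects several variables of one record at once. The sketch below is only an illustration: the check identifiers, field names and the two rules themselves are assumptions, and in a real study the rules would come from the medical experts on the project team.

```python
from datetime import date
from typing import Callable, Dict, List, NamedTuple


class Check(NamedTuple):
    check_id: str
    message: str
    fails: Callable[[Dict], bool]  # returns True when the record violates the rule


# Hypothetical cross-field rules, for illustration only.
CHECKS: List[Check] = [
    Check("DT01", "Treatment start precedes date of diagnosis",
          lambda r: r["treatment_start"] < r["diagnosis_date"]),
    Check("BP01", "Diastolic pressure is not below systolic pressure",
          lambda r: r["dbp"] >= r["sbp"]),
]


def run_checks(record: Dict) -> List[str]:
    """Return the identifiers of all checks that the record fails."""
    return [c.check_id for c in CHECKS if c.fails(record)]


# Illustrative records, not real study data.
records = {
    "001": {"diagnosis_date": date(2021, 3, 1), "treatment_start": date(2021, 3, 15),
            "sbp": 128, "dbp": 82},
    "002": {"diagnosis_date": date(2021, 6, 10), "treatment_start": date(2021, 5, 2),
            "sbp": 80, "dbp": 120},
}

findings = {subject: run_checks(r) for subject, r in records.items()}
print(findings)
```

A failed check here is a reason to raise a query with the site, not a reason to change the value automatically.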
Statistical methods for finding outliers in the data.
A variety of statistical methods are widely used in current data management practice to examine a data set. Searching for outliers can often reveal errors that standard range checks do not detect.
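One commonly used statistical technique is the interquartile range rule (Tukey's fences); the text does not prescribe a specific method, so it is used here only as an example. The sketch assumes a hypothetical body weight variable with a deliberately wide range check of 30 to 250 kg, and shows a value that passes the range check but stands out from the rest of the sample.

```python
import statistics
from typing import List, Sequence, Tuple


def tukey_fences(values: Sequence[float], k: float = 1.5) -> Tuple[float, float]:
    """Return the lower and upper fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values: Sequence[float], k: float = 1.5) -> List[float]:
    """Return the values that fall outside the Tukey fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Illustrative body weights in kg, not real study data.
weights_kg = [68, 72, 75, 81, 64, 79, 172, 70, 77, 66]

# Hypothetical range check: every value lies within 30..250 kg and passes.
range_failures = [w for w in weights_kg if not 30 <= w <= 250]

# The IQR rule still flags 172 as unusual compared with the other values.
outliers = flag_outliers(weights_kg)
print(range_failures, outliers)
```

A flagged value is a candidate for a query rather than a confirmed error: a weight of 172 kg is possible, but it may also be a height entered in the wrong field.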
Graphical data handling methods.
Although graphs and other data visualizations are not the most popular validation methods, they are underappreciated tools. They can be used to detect not only outliers but also patterns in the distribution of quantitative variables and clusters of categorical variables.
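As a rough illustration of how even a very simple chart can expose a pattern in a quantitative variable, the sketch below draws a text bar chart of the last digit of blood pressure readings. The readings are invented for the example, and the text-based chart is only a stand-in for a proper plotting library. A strong excess of readings ending in 0 may suggest that values were rounded at the site.

```python
from collections import Counter
from typing import Iterable, List


def terminal_digit_bars(values: Iterable[int]) -> List[str]:
    """Build one text bar per final digit (0-9), one '#' per observation."""
    counts = Counter(v % 10 for v in values)
    return [f"{digit} | {'#' * counts[digit]}" for digit in range(10)]


# Illustrative systolic blood pressure readings in mmHg, not real study data.
sbp = [120, 130, 140, 120, 135, 150, 130, 128, 140, 110, 120, 160]

bars = terminal_digit_bars(sbp)
print("\n".join(bars))
```

In this invented sample, ten of the twelve readings end in 0, a pattern that range checks and single-value outlier tests would not reveal.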
Despite the potential of these methods, there are currently no clear recommendations on methodologies for checking clinical trial data, including data from non-interventional studies, and there are no uniform standards for assessing data quality, a point that is also emphasized in the GCDMP.
It is also worth noting that data cleaning in non-interventional studies is still hotly debated and has long been viewed as a suspicious activity bordering on data manipulation. Concerns about where to draw the line between data manipulation and necessary data editing are well founded. It is therefore critical that the plan for checking the database and issuing data clarification queries is not only carefully developed but also reviewed and approved by other members of the project team. This minimizes the risk of unintentionally altering the data through seemingly self-evident corrections, or of steering data entry through the wording of a query. It must be remembered that the goal of NIS is to collect real-world information that may differ from what is expected, and that the need for exactly this kind of information is the reason non-interventional studies exist.