Title: | Cleaning Validation Functions for Pharmaceutical Cleaning Process |
---|---|
Description: | Provides essential Cleaning Validation functions for complying with pharmaceutical cleaning process regulatory standards. The package includes non-parametric methods to analyze drug active-ingredient residue (DAR), cleaning agent residue (CAR), and microbial colonies (Mic) for non-Poisson distributions. Additionally, Poisson methods are provided for Mic analysis when Mic data follow a Poisson distribution. |
Authors: | Mohamed Chan [aut], Wendy Lou [aut], Xiande Yang [aut, cre] |
Maintainer: | Xiande Yang <xiande.yang at gmail.com> |
License: | GPL-3 |
Version: | 1.0 |
Built: | 2025-02-13 03:57:34 UTC |
Source: | https://github.com/chandlerxiandeyang/cleaningvalidation |
This package offers a comprehensive suite of functions for cleaning validation, a critical component of quality control in pharmaceutical manufacturing. The included functions assist in analyzing residue data, evaluating cleaning efficacy, and ensuring that cleaning processes meet regulatory standards.
The functions primarily return data frames, streamlining data preprocessing, analysis, and the application of statistical methods for cleaning process evaluation. This toolset simplifies the workflow for cleaning validation professionals, providing resources for various tasks. Function cv01 cleans three data types. Functions cv02 to cv12 (excluding cv05) are designed for sequential DAR and CAR analysis. Functions cv13 and cv14 assess whether Mic follows a Poisson distribution. For Mic data that follows a Poisson distribution, function cv05 and functions cv15 to cv29 should be used in sequence. If Mic data do not follow a Poisson distribution, function cv05 and functions cv02 to cv12 (excluding cv06) are applicable. Function cv30 synthesizes the Process Performance Index (Ppu) for DAR, CAR, and Mic. Supplementary to its core capabilities, the package includes datasets—Eq_DAR, Eq_CAR, and Eq_Mic—for demonstrating functionality in practical contexts.
This package is free software; you may redistribute and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License or (at your option) any later version.
## Not run:
# Example code here to demonstrate package usage:
# This could include data loading, transforming, and cleaning validation analysis.
## End(Not run)
This function checks that residue_col, cleaning_event_col, and usl_col have the expected data types and contain no missing data. It also converts cleaning_event_col to a time-ordered factor. In this way it cleans and pre-processes the residue data so that they meet the criteria required for stability and capability analysis.
cv01_dfclean(data, residue_col, cleaning_event_col, usl_col)
data |
A data frame containing one of drug active-ingredient residue (DAR), cleaning agent residue (CAR), or microbial bioburden residue (Mic) data. |
residue_col |
The name of the column containing the numeric residue data. |
cleaning_event_col |
The name of the column containing the Cleaning Event data. |
usl_col |
The name of the column containing the numeric upper specification limit (USL) data. |
A cleaned and pre-processed data frame in which no variable has missing values, CleaningEvent is a time-ordered categorical variable, and Residue and USL are numeric.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assume Eq_DAR, Eq_CAR, and Eq_Mic are loaded datasets
# Clean and preprocess residue data for Eq_DAR
Eq_DAR <- cv01_dfclean(data = Eq_DAR, residue_col = "DAR", usl_col = "USL", cleaning_event_col = "CleaningEvent")
# Clean and preprocess residue data for Eq_CAR
Eq_CAR <- cv01_dfclean(data = Eq_CAR, residue_col = "CAR", usl_col = "USL", cleaning_event_col = "CleaningEvent")
# Clean and preprocess residue data for Eq_Mic
Eq_Mic <- cv01_dfclean(data = Eq_Mic, residue_col = "Mic", usl_col = "USL", cleaning_event_col = "CleaningEvent")
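The preprocessing described for cv01_dfclean can be sketched in base R. This is an illustration only, not the package's implementation: the helper name `sketch_dfclean`, the toy data, and the choice to drop incomplete rows (rather than, say, stop with an error) are assumptions.

```r
# Hypothetical helper for illustration; not part of the package.
sketch_dfclean <- function(data, residue_col, cleaning_event_col, usl_col) {
  cols <- c(residue_col, cleaning_event_col, usl_col)
  stopifnot(all(cols %in% names(data)))
  # Coerce residue and USL to numeric
  data[[residue_col]] <- as.numeric(data[[residue_col]])
  data[[usl_col]] <- as.numeric(data[[usl_col]])
  # Assumption: rows with a missing value in any of the three columns are dropped
  data <- data[stats::complete.cases(data[, cols]), , drop = FALSE]
  # Ordered factor with levels in order of first appearance
  # (assumes the rows are already in time order)
  ev <- as.character(data[[cleaning_event_col]])
  data[[cleaning_event_col]] <- factor(ev, levels = unique(ev), ordered = TRUE)
  data
}

# Toy data: DAR stored as text, with one missing value
toy <- data.frame(
  CleaningEvent = c("CE1", "CE1", "CE2", "CE2", "CE3"),
  DAR = c("1.2", "0.8", NA, "2.1", "1.5"),
  USL = c(10, 10, 10, 10, 10),
  stringsAsFactors = FALSE
)
cleaned <- sketch_dfclean(toy, "DAR", "CleaningEvent", "USL")
str(cleaned)
```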
This function processes three datasets to identify unique project IDs based on non-process-related out-of-specification (OOS) and reswab cases, then summarizes this information into a dataframe. If your data does not have reswab or OOS, you do not need to use this function.
cv02_nonpro_oos_reswab(Eq_DAR, Eq_CAR, Eq_Mic)
Eq_DAR |
A dataframe containing equipment DAR data. |
Eq_CAR |
A dataframe containing equipment CAR data. |
Eq_Mic |
A dataframe containing equipment Mic data. |
A dataframe summarizing the non-process-related OOS and reswab data.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv02_nonpro_oos_reswab(Eq_DAR, Eq_CAR, Eq_Mic)
This function takes a dataset and computes the percentage of residue over USL for each event, as well as mean and median of these percentages for each cleaning event and overall.
cv03_usl_unification(data, cleaning_event_col, residue_col, usl_col)
data |
A dataframe containing the relevant dataset. |
cleaning_event_col |
Name of the column in 'data' that contains the cleaning event identifiers as a string. |
residue_col |
Name of the column in 'data' that contains the residue measurements as a string. |
usl_col |
Name of the column in 'data' that contains the USL values as a string. |
A dataframe with original data and additional columns for residue percentages, and their mean and median values per cleaning event and overall.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv03_usl_unification(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
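The USL unification step can be sketched in base R as follows. The column names `DAR_Pct`, `DAR_Pct_Median`, and `USL_Pct` appear in the examples of this manual; the remaining column names, the toy data, and the convention that `USL_Pct` equals 100 are assumptions made for illustration.

```r
toy <- data.frame(
  CleaningEvent = rep(c("CE1", "CE2"), each = 3),
  DAR = c(1, 2, 3, 2, 4, 9),
  USL = c(10, 10, 10, 20, 20, 20)
)

# Residue expressed as a percentage of its own USL
toy$DAR_Pct <- 100 * toy$DAR / toy$USL
# Assumption: on the percentage scale the USL itself becomes 100
toy$USL_Pct <- 100

# Per-event mean and median, repeated on every row of the event
toy$DAR_Pct_Mean <- ave(toy$DAR_Pct, toy$CleaningEvent, FUN = mean)
toy$DAR_Pct_Median <- ave(toy$DAR_Pct, toy$CleaningEvent, FUN = median)

# Overall (grand) mean and median; these two column names are hypothetical
toy$DAR_Pct_Grand_Mean <- mean(toy$DAR_Pct)
toy$DAR_Pct_Grand_Median <- median(toy$DAR_Pct)

toy
```

Expressing each residue relative to its own limit puts measurements with different USLs on a common scale, which is what allows them to be pooled in the later charts and tests.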
This function takes a dataset and a column of residue percentages and generates a histogram overlaid with a kernel density estimate (KDE) curve. It calculates the quantiles P0.5, P0.8413, P0.9772, and P0.99865 and marks them on the plot; P0.99865 serves as the upper control limit (UCL).
cv04_histogram_kde(data, residue_pct_col)
data |
A dataframe containing the relevant dataset. |
residue_pct_col |
The name of the column in 'data' that contains the residue percentages. |
A ggplot object representing the histogram with KDE curve.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, "CleaningEvent", "DAR", usl_col = "USL")
cv04_histogram_kde(data = Eq_DAR, residue_pct_col = "DAR_Pct")
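The four probability levels are the standard normal cumulative probabilities at 0, 1, 2, and 3 standard deviations, which the first lines below confirm. The rest is a minimal sketch of reading such quantiles off a kernel density estimate by numerical integration; the helper `kde_quantile`, the default bandwidth, and the simulated data are assumptions, and the package may compute its quantiles differently.

```r
# Standard normal CDF at 0, 1, 2, 3 sigma
probs <- pnorm(0:3)
round(probs, 5)

# Hypothetical helper: quantiles of a KDE via its cumulative area on a grid
kde_quantile <- function(x, probs, n = 2048) {
  d <- density(x, n = n)            # Gaussian kernel, default bandwidth
  cdf <- cumsum(d$y) / sum(d$y)     # normalised cumulative area
  approx(cdf, d$x, xout = probs, ties = "ordered", rule = 2)$y
}

set.seed(1)
x <- rexp(200, rate = 1 / 15)       # simulated right-skewed residue percentages
q <- kde_quantile(x, probs)
names(q) <- c("P0.5", "P0.8413", "P0.9772", "P0.99865")
q
```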
This function performs the Shapiro-Wilk test for normality on a specified variable in a dataset. It returns a data frame with the variable name, the Shapiro-Wilk statistic, the p-value in scientific notation, and an indication of whether the p-value is less than 0.05.
cv05_sw_norm_test_1(data, residue_col)
data |
A data frame containing the dataset. |
residue_col |
The name of the column to test for normality. |
A data frame with the Shapiro-Wilk test results.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
data(Eq_Mic)
cv05_sw_norm_test_1(data = Eq_Mic, residue_col = "Mic")
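A result of the shape described above can be assembled from base R's `shapiro.test`. The helper name `sketch_sw`, the output column names, and the simulated counts are assumptions for illustration, not the package's code.

```r
# Hypothetical helper; column names are illustrative.
sketch_sw <- function(data, residue_col, alpha = 0.05) {
  res <- shapiro.test(data[[residue_col]])
  data.frame(
    Variable = residue_col,
    W = unname(res$statistic),
    P_Value = format(res$p.value, scientific = TRUE, digits = 4),
    Significant = res$p.value < alpha,   # TRUE suggests departure from normality
    stringsAsFactors = FALSE
  )
}

set.seed(42)
toy <- data.frame(Mic = rpois(60, lambda = 2))  # simulated microbial counts
sw <- sketch_sw(toy, "Mic")
sw
```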
This function performs the Shapiro-Wilk test for normality on two specified variables within a dataset. It returns a data frame with the variables' names, Shapiro-Wilk statistics, p-values in scientific notation, and indications of whether the p-values are less than 0.05.
cv06_sw_norm_test_2(data, residue_col, residue_pct_col)
data |
A data frame containing the dataset. |
residue_col |
The name of the first column to test for normality. |
residue_pct_col |
The name of the second column to test for normality. |
A data frame with Shapiro-Wilk test results for both variables.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming Eq_DAR is a predefined dataset
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, "CleaningEvent", "DAR", "USL")
cv06_sw_norm_test_2(data = Eq_DAR, residue_col = "DAR", residue_pct_col = "DAR_Pct")
This function creates a control chart and a density plot for the median residue percentages based on kernel density estimation.
cv07_median_control_chart(data, cleaning_event_col, residue_pct_median_col)
data |
A data frame containing the data to plot. |
cleaning_event_col |
The name of the column containing cleaning event identifiers. |
residue_pct_median_col |
The name of the column containing the calculated median residue percentages. |
A cowplot object containing the combined control chart and density plot.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_DAR' is a data frame with appropriate columns:
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, "CleaningEvent", "DAR", "USL")
cv07_median_control_chart(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_pct_median_col = "DAR_Pct_Median")
This function creates a control chart for the median residue percentages based on kernel density estimation. The input residue_pct_median_col can also be the median of a non-USL-unified variable, such as Mic_Median, DAR_Median, or CAR_Median.
cv071_median_control_chart(data, cleaning_event_col, residue_pct_median_col)
data |
A data frame containing the data to plot. |
cleaning_event_col |
The name of the column containing cleaning event identifiers. |
residue_pct_median_col |
The name of the column containing the calculated median residue percentages. |
The median control chart.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_DAR' is a data frame with appropriate columns:
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, "CleaningEvent", "DAR", "USL")
cv07_median_control_chart(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_pct_median_col = "DAR_Pct_Median")
This function generates a variability chart for cleaning events, showing data points, outliers, and overall statistics like the grand mean and median.
cv08_variability_chart(data, cleaning_event_col, residue_pct_col, usl_pct_col)
data |
A data frame containing the data to plot. |
cleaning_event_col |
Name of the column representing cleaning events (as a string). |
residue_pct_col |
Name of the column representing residue percentages (as a string). |
usl_pct_col |
Name of the column representing the upper specification limit percentages (as a string). |
A ggplot object representing the variability chart.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
Eq_DAR <- cv01_dfclean(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
cv08_variability_chart(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_pct_col = "DAR_Pct", usl_pct_col = "USL_Pct")
Performs a Kruskal-Wallis test on residue percentages grouped by cleaning event.
cv09_kw_test(data, residue_col, cleaning_event_col)
data |
A data frame containing the data. |
residue_col |
The name of the column containing residue percentages. |
cleaning_event_col |
The name of the column containing cleaning event identifiers. |
A data frame of Kruskal-Wallis test results.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_DAR' is the data frame, 'DAR_Pct' is the residue column,
# and 'CleaningEvent' is the cleaning event column.
Eq_DAR <- cv01_dfclean(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
kw_test_results <- cv09_kw_test(data = Eq_DAR, residue_col = "DAR_Pct", cleaning_event_col = "CleaningEvent")
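For readers who want to see the underlying test, base R's `kruskal.test` performs the same kind of comparison. The sketch below uses simulated data and illustrative output column names; how cv09_kw_test formats its own result is not shown here.

```r
set.seed(7)
toy <- data.frame(
  CleaningEvent = factor(rep(c("CE1", "CE2", "CE3"), each = 8)),
  DAR_Pct = rexp(24, rate = 1 / 10)   # simulated residue percentages
)

# Rank-based test of whether the events share a common distribution
kw <- kruskal.test(DAR_Pct ~ CleaningEvent, data = toy)

kw_df <- data.frame(
  Statistic = unname(kw$statistic),
  DF = unname(kw$parameter),
  P_Value = kw$p.value
)
kw_df
```

A large p-value here is consistent with a stable process, in the sense that no cleaning event stands out from the rest.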
Performs Dunn's test on residue values across cleaning events. The control group is the cleaning event whose median is closest to the grand median. This function is intended for investigation purposes.
cv10_dunn_test_vs_control(data, residue_col, cleaning_event_col)
data |
A data frame containing the data. |
residue_col |
The name of the column containing residue. |
cleaning_event_col |
The name of the column containing cleaning event identifiers. |
A data frame of Dunn's test results with control group.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# 'Eq_DAR' is the data frame, 'DAR_Pct' is the residue column, and
# 'CleaningEvent' is the cleaning event column.
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, residue_col = "DAR", cleaning_event_col = "CleaningEvent", usl_col = "USL")
dunn_test_results_vs_control <- cv10_dunn_test_vs_control(data = Eq_DAR, residue_col = "DAR_Pct", cleaning_event_col = "CleaningEvent")
Performs Variability Components Analysis (VCA) by median on residue percentages grouped by cleaning event, with bootstrap confidence intervals.
cv11_vca_by_median(data, residue_col, cleaning_event_col, n_bootstrap = 2000)
data |
A data frame containing the data. |
residue_col |
The name of the column containing residue percentages. |
cleaning_event_col |
The name of the column containing cleaning event identifiers. |
n_bootstrap |
The number of bootstrap iterations. Default is 2000. |
A data frame summarizing variability components analysis by median along with confidence intervals from bootstrap.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_DAR' is the data frame, 'DAR_Pct' is the residue column,
# and 'CleaningEvent' is the cleaning event column.
Eq_DAR <- cv01_dfclean(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
summary <- cv11_vca_by_median(data = Eq_DAR, residue_col = "DAR_Pct", cleaning_event_col = "CleaningEvent", n_bootstrap = 2000)
Calculates the Process Performance Index (Ppu) using kernel density estimation (KDE).
cv12_kde_ppu(data, residue_col, cleaning_event_col, usl_col, n_bootstrap = 1000)
data |
The dataset containing the columns specified in other parameters. |
residue_col |
The name of the column containing residue data. |
cleaning_event_col |
The name of the column containing cleaning event data (unused). |
usl_col |
The name of the column containing USL values. |
n_bootstrap |
The number of bootstrap samples to use. |
A dataframe with the estimated Ppu and its 95% confidence interval.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
cv12_kde_ppu(data = Eq_DAR, residue_col = "DAR_Pct", cleaning_event_col = "CleaningEvent", usl_col = "USL_Pct", n_bootstrap = 1000)
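One common percentile-based definition of the index is Ppu = (USL - P0.5) / (P0.99865 - P0.5), with both quantiles taken from the fitted density. The sketch below applies that definition to a KDE and adds a percentile bootstrap. Whether cv12_kde_ppu uses exactly this formula is an assumption, and the helper names, simulated data, and number of resamples are illustrative.

```r
# Hypothetical helper: quantiles of a KDE via its cumulative area on a grid
kde_quantile <- function(x, probs, n = 2048) {
  d <- density(x, n = n)
  cdf <- cumsum(d$y) / sum(d$y)
  approx(cdf, d$x, xout = probs, ties = "ordered", rule = 2)$y
}

# Assumed percentile-based Ppu: distance from median to USL,
# relative to distance from median to the 99.865th percentile
ppu_kde <- function(x, usl) {
  q <- kde_quantile(x, c(0.5, 0.99865))
  (usl - q[1]) / (q[2] - q[1])
}

set.seed(123)
pct <- rgamma(150, shape = 2, rate = 0.2)  # simulated residue percentages
usl_pct <- 100                             # USL on the percentage scale

est <- ppu_kde(pct, usl_pct)
# Percentile bootstrap: resample the data and recompute the index
boot <- replicate(200, ppu_kde(sample(pct, replace = TRUE), usl_pct))
ci <- quantile(boot, c(0.025, 0.975))

data.frame(Method = "KDE", Ppu = est, CI_Lower = ci[[1]], CI_Upper = ci[[2]])
```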
Conducts a goodness-of-fit test to evaluate if the Mic data follows a Poisson distribution.
cv13_poisson_test(data, residue_col)
data |
A dataframe containing the observed data. |
residue_col |
A string specifying the column in 'data' to be tested. |
A dataframe containing the chi-squared statistic and the p-value from the goodness-of-fit test.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming Eq_Mic is your dataframe and Mic is the column to be tested
cv13_poisson_test(data = Eq_Mic, residue_col = "Mic")
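A chi-squared goodness-of-fit test for a Poisson distribution can be sketched in base R as below. The binning rule (counts 0 to `max_bin - 1` plus a pooled upper tail), the helper name, and the simulated data are assumptions; the package may bin the data differently.

```r
# Hypothetical helper: Pearson chi-squared test against a fitted Poisson
sketch_poisson_gof <- function(x, max_bin) {
  n <- length(x)
  lambda <- mean(x)                       # maximum likelihood estimate
  # Observed counts for 0, 1, ..., max_bin - 1, then a pooled ">= max_bin" bin
  obs <- c(tabulate(x[x < max_bin] + 1L, nbins = max_bin), sum(x >= max_bin))
  # Matching Poisson probabilities (they sum to 1)
  p <- c(dpois(0:(max_bin - 1), lambda),
         ppois(max_bin - 1, lambda, lower.tail = FALSE))
  expected <- n * p
  stat <- sum((obs - expected)^2 / expected)
  df <- length(obs) - 2                   # bins - 1, minus 1 for estimating lambda
  data.frame(Chi_Squared = stat, DF = df,
             P_Value = pchisq(stat, df, lower.tail = FALSE))
}

set.seed(11)
x <- rpois(200, lambda = 2)               # simulated microbial counts
gof <- sketch_poisson_gof(x, max_bin = 5)
gof
```

A small p-value would argue against the Poisson assumption, in which case the non-parametric route described in the package overview applies.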
Performs a dispersion test on a Poisson regression model to check for overdispersion. The function fits a Poisson regression model to the data using the specified columns, and then performs a dispersion test using the model.
cv14_dispersion_test(data, residue_col, cleaning_event_col)
data |
A dataframe containing the observed data. |
residue_col |
A string specifying the response column in the model. |
cleaning_event_col |
A string specifying the explanatory variable in the model. |
A dataframe object with the results of the overdispersion test, including the Z-value, P-value, and dispersion estimate.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv14_dispersion_test(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent")
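The idea behind an overdispersion check can be illustrated with base R alone: fit the Poisson regression and compare the Pearson chi-squared statistic with its residual degrees of freedom. This sketch computes only that ratio, on simulated data; the formal Z-test reported by cv14_dispersion_test is not reproduced here.

```r
set.seed(3)
toy <- data.frame(
  CleaningEvent = factor(rep(paste0("CE", 1:5), each = 6)),
  Mic = rpois(30, lambda = 2)            # simulated microbial counts
)

# Poisson regression of counts on cleaning event
fit <- glm(Mic ~ CleaningEvent, family = poisson, data = toy)

# Pearson dispersion estimate: close to 1 under a Poisson model,
# noticeably above 1 when the data are overdispersed
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)
dispersion
```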
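Calculates summary statistics for Mic data: the mean and median for each cleaning event, together with the grand mean and grand median.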
cv15_mic_mutate(data, cleaning_event_col, residue_col)
data |
A dataframe containing the data. |
cleaning_event_col |
The name of the column that identifies the cleaning event. |
residue_col |
The name of the column containing residue measurements. |
A dataframe with new columns for mean, median, grand mean, and grand median of Mic values.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv15_mic_mutate(data=Eq_Mic, cleaning_event_col="CleaningEvent", residue_col="Mic")
This function generates a u-chart for visualizing the stability and capability of a process based on a Poisson distribution.
cv16_u_chart(data, residue_col, cleaning_event_col)
data |
A data frame containing the data for plotting. |
residue_col |
The name of the column representing residue data (numeric). |
cleaning_event_col |
The name of the column representing cleaning events (factor or character). |
A ggplot object representing the u-chart.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv16_u_chart(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent")
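The textbook u-chart places the centre line at the overall count per sample and sets three-sigma limits that widen for events with fewer samples. The sketch below computes those limits in base R on simulated data; treating each swab as one inspection unit and truncating the lower limit at zero are assumptions, and the plotting itself is left out.

```r
set.seed(5)
toy <- data.frame(
  CleaningEvent = rep(paste0("CE", 1:6), times = c(4, 5, 4, 6, 5, 4)),
  Mic = rpois(28, lambda = 3)            # simulated microbial counts
)

total <- tapply(toy$Mic, toy$CleaningEvent, sum)     # counts per event
n <- tapply(toy$Mic, toy$CleaningEvent, length)      # samples per event
u <- total / n                                       # counts per sample
u_bar <- sum(total) / sum(n)                         # centre line

chart <- data.frame(
  CleaningEvent = names(u),
  u = as.numeric(u),
  LCL = as.numeric(pmax(0, u_bar - 3 * sqrt(u_bar / n))),
  CL = u_bar,
  UCL = as.numeric(u_bar + 3 * sqrt(u_bar / n))
)
chart
```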
This function computes the cumulative sum (CUSUM) for the mean values of a specified residue column aggregated by a cleaning event column. It then generates a CUSUM chart for visualizing the stability of a process based on a Poisson distribution. The reference value 'k' can be provided; if not, it defaults to half of the process average lambda.
cv17_cusum(data, residue_col, cleaning_event_col, k = NULL)
data |
A data frame containing the dataset for analysis. |
residue_col |
The name of the column representing residue data. |
cleaning_event_col |
The name of the column representing cleaning events. |
k |
The reference value used in calculating CUSUM, by default it is set to half of lambda. |
A ggplot object representing the CUSUM chart.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# To create a CUSUM chart with default k value
cv17_cusum(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent")
# To create a CUSUM chart with a specified k value
cv17_cusum(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent", k = 0.75)
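The recursion behind a tabular CUSUM can be sketched as follows. The default k of half the process average follows the description above; the two-sided form, the use of the mean of the event means as the process average, and the helper name are assumptions for illustration.

```r
# Hypothetical helper: two-sided tabular CUSUM on per-event means
sketch_cusum <- function(means, k = NULL) {
  lambda <- mean(means)                  # assumed process average
  if (is.null(k)) k <- lambda / 2        # default reference value
  upper <- numeric(length(means))
  lower <- numeric(length(means))
  cu <- 0
  cl <- 0
  for (i in seq_along(means)) {
    # Accumulate deviations beyond lambda + k (upward) or lambda - k (downward)
    cu <- max(0, cu + means[i] - lambda - k)
    cl <- max(0, cl + lambda - k - means[i])
    upper[i] <- cu
    lower[i] <- cl
  }
  data.frame(Index = seq_along(means), Mean = means,
             CUSUM_Upper = upper, CUSUM_Lower = lower)
}

event_means <- c(2, 2, 2, 6, 2)          # toy per-event means with one spike
cs <- sketch_cusum(event_means, k = 0.5)
cs
```

The upper sum stays at zero until the fourth event, jumps when the spike arrives, and then decays, which is the pattern a CUSUM chart is designed to make visible.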
Generates an EWMA chart for a specified residue column grouped by cleaning events.
cv18_ewma(data, residue_col, cleaning_event_col, alpha = 0.2)
data |
A data frame containing the data set for analysis. |
residue_col |
The name of the column representing residue data. |
cleaning_event_col |
The name of the column representing cleaning events. |
alpha |
The smoothing parameter for the EWMA calculation, default is 0.2. |
A ggplot object representing the EWMA chart.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_Mic' is a data frame, 'Mic' is the residue column of interest,
# and 'CleaningEvent' is the column representing cleaning events.
ewma_plot <- cv18_ewma(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent")
print(ewma_plot)
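The EWMA statistic itself follows the recursion z_i = alpha * x_i + (1 - alpha) * z_(i-1). A minimal base R sketch is shown below; the helper name and the choice of the overall mean as the default starting value are assumptions, and control limits are not computed.

```r
# Hypothetical helper: exponentially weighted moving average of a series
sketch_ewma <- function(x, alpha = 0.2, start = mean(x)) {
  z <- numeric(length(x))
  prev <- start
  for (i in seq_along(x)) {
    # New value gets weight alpha, the running average gets 1 - alpha
    prev <- alpha * x[i] + (1 - alpha) * prev
    z[i] <- prev
  }
  z
}

# Small worked case with an explicit starting value
z <- sketch_ewma(c(2, 4, 6), alpha = 0.5, start = 0)
z
```

A smaller alpha gives more weight to history and makes the chart more sensitive to small, sustained shifts.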
Fits a simple Poisson model to the data and returns a data frame containing the model's term, estimate, standard error, z value, and p-value, formatted to a fixed number of decimal places.
cv19_poisson_simple(data, residue_col)
data |
A data frame containing the data set for analysis. |
residue_col |
The name of the column representing residue data. |
A data frame with the formatted summary of the Poisson regression model.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_Mic' is a data frame and 'Mic' is the residue column of interest.
cv19_poisson_simple(data = Eq_Mic, residue_col = "Mic")
Fits a fixed effects Poisson model and returns a data frame with the summary. A significant p-value indicates that the corresponding cleaning event differs significantly from the other cleaning events. For a stable cleaning process, all p-values should be non-significant.
cv20_poisson_fixed(data, residue_col, cleaning_event_col)
data |
Data frame containing the data. |
residue_col |
The name of the residue column. |
cleaning_event_col |
The name of the cleaning event column. |
A data frame output with the fixed effect summary.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
fixed_effect_summary <- cv20_poisson_fixed(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent")
Fits a mixed-effects Poisson model to the data and returns a data frame containing the fixed effect part estimates, standard errors, z-values, and p-values.
cv21_poisson_mixed(data, residue_col, cleaning_event_col)
data |
A data frame containing the data set for analysis. |
residue_col |
A string specifying the column in 'data' that contains residue data. |
cleaning_event_col |
A string specifying the column in 'data' for random effects grouping. |
A data frame with the fixed effect summary of the mixed-effects Poisson regression model.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming 'Eq_Mic' is a data frame, 'Mic' is the residue column of interest,
# and 'CleaningEvent' is the column for random effects grouping.
mixed_effect_summary <- cv21_poisson_mixed(data = Eq_Mic, residue_col = "Mic", cleaning_event_col = "CleaningEvent")
print(mixed_effect_summary)
This function fits a Poisson mixed-effects model with a specified random effect and extracts the variances and standard deviations of the random effects.
cv22_var_random_effect(data, residue_col, cleaning_event_col)
data |
A data frame containing the data. |
residue_col |
The name of the residue column. |
cleaning_event_col |
The name of the column used for random effects grouping. |
A data frame with the variances and standard deviations of the random effects.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
ra_table <- cv22_var_random_effect(data=Eq_Mic, residue_col="Mic", cleaning_event_col="CleaningEvent")
This function fits a Poisson mixed-effects model with a specified random effect and extracts the random effect coefficients and their standard deviations.
cv23_random_effect_coef(data, residue_col, cleaning_event_col)
data |
A data frame containing the data. |
residue_col |
The name of the residue column. |
cleaning_event_col |
The name of the column used for random effects grouping. |
A data frame with the random effect coefficients and standard deviations.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
re_coefs <- cv23_random_effect_coef(data=Eq_Mic, residue_col="Mic", cleaning_event_col="CleaningEvent")
This function performs a variance component analysis using a mixed-effects model with a Poisson distribution to estimate within-group and between-group variance for microbial count data. It assumes the data are grouped by cleaning event and evaluates the residue or microbial counts within these groups.
cv24_vca_mic(data, residue_col, cleaning_event_col)
data |
A data frame containing the dataset. |
residue_col |
The name of the column in 'data' that contains the residue or microbial counts. |
cleaning_event_col |
The name of the column in 'data' that contains the grouping factor for cleaning events. |
A data frame summarizing the variance components, including within-group variance, between-group variance, and total variance, along with their percentages and standard deviations.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming `Eq_Mic` is your dataframe, `Mic` is the microbial counts column,
# and `CleaningEvent` is the cleaning event column:
cv24_vca_mic(Eq_Mic, "Mic", "CleaningEvent")
Performs a process performance calculation using binomial distribution. This includes a bootstrap procedure to estimate the confidence interval of the Process Performance Index (Ppu).
cv25_qbinom_ppu(data, residue_col, cleaning_event_col, usl_col)
data |
A data frame containing the dataset. |
residue_col |
Name of the column in 'data' that contains the residue or defect counts. |
cleaning_event_col |
Name of the column in 'data' that groups data by cleaning event. |
usl_col |
Name of the column in 'data' that contains the Upper Specification Limit (USL) for each group. |
A data frame with the calculated Ppu and its 95% confidence interval, along with the method used ("Q-Binomial").
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
# Assuming `Eq_Mic` is the dataframe with columns "Mic", "CleaningEvent", and "USL":
cv25_qbinom_ppu(Eq_Mic, "Mic", "CleaningEvent", "USL")
This function calculates the Process Performance Index (Ppu) for data assumed to follow a Poisson distribution. It includes a bootstrap method for estimating the confidence interval of the Ppu.
cv26_qpoisson_ppu(data, residue_col, cleaning_event_col, usl_col)
data |
A data frame containing the dataset. |
residue_col |
Name of the column in 'data' containing residue counts. |
cleaning_event_col |
Name of the column in 'data' used to group data by cleaning event. |
usl_col |
Name of the column in 'data' that contains the Upper Specification Limit (USL) for each group. |
A data frame with columns Method, Ppu, CI_Lower, and CI_Upper.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv26_qpoisson_ppu(Eq_Mic, "Mic", "CleaningEvent", "USL")
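The idea of a quantile-based Ppu with a bootstrap confidence interval can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes the common percentile definition Ppu = (USL - median) / (Q0.99865 - median), a Poisson rate estimated by the sample mean, and resampling of individual observations rather than cleaning events. The helper `poisson_ppu` and the counts are hypothetical.

```r
# Hypothetical helper (not part of the package): percentile-method Ppu
# under a Poisson assumption.
poisson_ppu <- function(counts, usl) {
  lambda <- mean(counts)              # Poisson rate estimated by the sample mean
  med    <- qpois(0.5, lambda)        # median of the fitted Poisson
  upper  <- qpois(0.99865, lambda)    # upper quantile, pnorm(3) for a normal
  if (upper == med) return(NA_real_)  # degenerate fit: index is undefined
  (usl - med) / (upper - med)
}

# Made-up counts and limit, for illustration only.
counts <- c(2, 0, 3, 1, 4, 2, 1, 3, 2, 2)
usl    <- 25

set.seed(123)  # reproducible resampling
boot <- replicate(2000, poisson_ppu(sample(counts, replace = TRUE), usl))
ci   <- quantile(boot, c(0.025, 0.975), na.rm = TRUE, names = FALSE)

# Same column layout as the return value described above.
result <- data.frame(
  Method   = "Q-Poisson",
  Ppu      = poisson_ppu(counts, usl),
  CI_Lower = ci[1],
  CI_Upper = ci[2]
)
print(result)
```

Because Poisson quantiles are integers, the bootstrap distribution of this index is discrete, so the interval limits can coincide with the point estimate on small samples.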
Calculates the Process Performance Index (Ppu) using Anscombe's transformation. This function also performs a bootstrap to estimate the confidence interval of Ppu.
cv27_anscombe_ppu(data, residue_col, cleaning_event_col, usl_col)
data |
A data frame containing the dataset. |
residue_col |
Name of the column in 'data' containing residue or defect counts. |
cleaning_event_col |
Name of the column in 'data' used for grouping by cleaning event. |
usl_col |
Name of the column in 'data' that contains the Upper Specification Limit (USL). |
A data frame with columns for the Method, Ppu, CI_Lower, and CI_Upper.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv27_anscombe_ppu(Eq_Mic, "Mic", "CleaningEvent", "USL")
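For readers unfamiliar with the transformation, the sketch below shows the standard Anscombe transform, 2 * sqrt(x + 3/8), and one plausible way to form a Ppu on the transformed scale. It assumes the normal-theory formula (USL - mean) / (3 * sd) is applied after transforming both the counts and the limit; whether the package does exactly this is not confirmed here. The helper names and counts are hypothetical.

```r
# Anscombe's variance-stabilising transform for Poisson counts.
anscombe <- function(x) 2 * sqrt(x + 3 / 8)

# Hypothetical helper (not part of the package): normal-theory Ppu
# computed on the transformed scale.
anscombe_ppu <- function(counts, usl) {
  y <- anscombe(counts)                     # transformed observations
  (anscombe(usl) - mean(y)) / (3 * sd(y))   # the limit is transformed the same way
}

# Made-up counts, for illustration only.
counts <- c(2, 0, 3, 1, 4, 2, 1, 3, 2, 2)
ppu <- anscombe_ppu(counts, usl = 25)
print(ppu)
```

Transforming the limit with the same function keeps the comparison on one scale, which matters because the transform is nonlinear.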
This function calculates the Process Performance Index (Ppu) using Freeman's transformation, including a bootstrap method to estimate the confidence interval of Ppu.
cv28_freeman_ppu(data, residue_col, cleaning_event_col, usl_col)
data |
A data frame containing the dataset. |
residue_col |
The name of the column in 'data' containing residue or defect counts. |
cleaning_event_col |
The name of the column in 'data' used for grouping data by cleaning event. |
usl_col |
The name of the column in 'data' that contains the Upper Specification Limit (USL). |
A data frame with columns for the Method, Ppu, CI_Lower, and CI_Upper.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
cv28_freeman_ppu(Eq_Mic, "Mic", "CleaningEvent", "USL")
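Assuming that "Freeman's transformation" refers to the Freeman-Tukey square-root transform, sqrt(x) + sqrt(x + 1), the sketch below checks by simulation that it roughly stabilises the variance of Poisson counts near 1, and then forms a Ppu on the transformed scale. The helpers, the simulated rates, and the counts are hypothetical, and the Ppu formula is the normal-theory one, not necessarily the package's.

```r
# Freeman-Tukey square-root transform (assumed meaning of "Freeman").
freeman_tukey <- function(x) sqrt(x) + sqrt(x + 1)

# Simulation check: the variance of the transformed counts should stay
# close to 1 across quite different Poisson rates.
set.seed(42)
lambdas <- c(5, 20, 50)
vars <- sapply(lambdas, function(l) var(freeman_tukey(rpois(1e5, l))))
print(round(vars, 2))

# Hypothetical helper (not part of the package): normal-theory Ppu
# on the transformed scale.
freeman_ppu <- function(counts, usl) {
  y <- freeman_tukey(counts)
  (freeman_tukey(usl) - mean(y)) / (3 * sd(y))
}

# Made-up counts, for illustration only.
counts <- c(2, 0, 3, 1, 4, 2, 1, 3, 2, 2)
ppu <- freeman_ppu(counts, usl = 25)
print(ppu)
```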
This function calculates the Process Performance Index (Ppu) for Mic using five methods: Q-Binomial, Q-Poisson, Anscombe, Freeman, and KDE. It returns a dataframe with the Ppu value and the lower and upper confidence limits for each method, and appends a row for the method with the minimum Ppu value.
cv29_mic_ppu(data, residue_col, cleaning_event_col, usl_col)
data |
A dataframe containing the dataset. |
residue_col |
The name of the column in 'data' that contains the residue values. |
cleaning_event_col |
The name of the column in 'data' that contains the cleaning event identifiers. |
usl_col |
The name of the column in 'data' that contains the Upper Specification Limit values. |
A dataframe with the Ppu calculations for each method and the minimum Ppu method.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
MicPPU <- cv29_mic_ppu(Eq_Mic, "Mic", "CleaningEvent", "USL")
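The step of appending a row for the most conservative method can be illustrated with a small base R sketch. The numbers below are placeholders invented for the illustration, not output of the package, and the label used for the appended row is an assumption about the layout.

```r
# Placeholder per-method results; the values are made up for illustration.
ppu_tbl <- data.frame(
  Method   = c("Q-Binomial", "Q-Poisson", "Anscombe", "Freeman", "KDE"),
  Ppu      = c(2.10, 1.85, 1.92, 1.95, 2.30),
  CI_Lower = c(1.70, 1.50, 1.55, 1.60, 1.80),
  CI_Upper = c(2.60, 2.30, 2.35, 2.40, 2.90)
)

# Pick the row with the smallest Ppu and relabel it before appending.
min_row <- ppu_tbl[which.min(ppu_tbl$Ppu), ]
min_row$Method <- paste0("Minimum (", min_row$Method, ")")

result <- rbind(ppu_tbl, min_row)
rownames(result) <- NULL  # tidy row names after binding
print(result)
```

Reporting the minimum across methods is a conservative choice: the process is judged by the least favourable estimate.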
This function calculates Ppu values for DAR, CAR, and Mic using the KDE method provided by 'cv12_kde_ppu'. It then calls 'cv29_mic_ppu' to calculate the Mic Ppu under each method and extracts the method with the minimum Ppu value. The function assumes that the datasets 'Eq_DAR', 'Eq_CAR', and 'Eq_Mic' are available and conform to the expected column naming conventions and data structures. It relies on 'cv12_kde_ppu' and 'cv29_mic_ppu' returning consistent, correctly formatted results.
cv30_dar_car_mic_ppu(
  dar_data, dar_residue_col, dar_cleaning_event_col, dar_usl_col,
  car_data, car_residue_col, car_cleaning_event_col, car_usl_col,
  mic_data, mic_residue_col, mic_cleaning_event_col, mic_usl_col
)
dar_data |
A dataframe containing DAR data. |
dar_residue_col |
The name of the DAR residue column. |
dar_cleaning_event_col |
The name of the DAR cleaning event identifier column. |
dar_usl_col |
The name of the DAR Upper Specification Limit column. |
car_data |
A dataframe containing CAR data. |
car_residue_col |
The name of the CAR residue column. |
car_cleaning_event_col |
The name of the CAR cleaning event identifier column. |
car_usl_col |
The name of the CAR Upper Specification Limit column. |
mic_data |
A dataframe containing Mic data. |
mic_residue_col |
The name of the Mic residue column. |
mic_cleaning_event_col |
The name of the Mic cleaning event identifier column. |
mic_usl_col |
The name of the Mic Upper Specification Limit column. |
A dataframe with Ppu values for DAR, CAR, and Mic, along with the Overall Minimum Ppu.
Chan, Mohamed, Lou, Wendy, Yang, Xiande [xiande.yang at gmail.com]
Eq_DAR <- cv03_usl_unification(data = Eq_DAR, cleaning_event_col = "CleaningEvent", residue_col = "DAR", usl_col = "USL")
Eq_CAR <- cv03_usl_unification(data = Eq_CAR, cleaning_event_col = "CleaningEvent", residue_col = "CAR", usl_col = "USL")
df1 <- cv30_dar_car_mic_ppu(
  dar_data = Eq_DAR, dar_residue_col = "DAR_Pct", dar_cleaning_event_col = "CleaningEvent", dar_usl_col = "USL_Pct",
  car_data = Eq_CAR, car_residue_col = "CAR_Pct", car_cleaning_event_col = "CleaningEvent", car_usl_col = "USL_Pct",
  mic_data = Eq_Mic, mic_residue_col = "Mic", mic_cleaning_event_col = "CleaningEvent", mic_usl_col = "USL")
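Because the function depends on each data frame carrying the named columns, it can help to verify them before the call. The sketch below is one hedged way to do that in base R; `check_columns` is a hypothetical helper, not part of the package, and the stand-in data frame uses made-up values with the column names from the example above.

```r
# Hypothetical helper (not part of the package): stop early when a data
# frame lacks any of the columns that will be passed by name.
check_columns <- function(data, cols) {
  missing_cols <- setdiff(cols, names(data))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Stand-in data frame with made-up values, shaped like the DAR input above.
dar <- data.frame(
  DAR_Pct       = c(12.5, 8.1, 20.3),
  USL_Pct       = c(100, 100, 100),
  CleaningEvent = factor(c("CE1", "CE1", "CE2"))
)

check_columns(dar, c("DAR_Pct", "CleaningEvent", "USL_Pct"))  # passes silently

# A misspelled column name is reported up front instead of failing later.
msg <- tryCatch(
  check_columns(dar, c("DAR_Pct", "CleaningEvnt")),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The same check can be repeated for the CAR and Mic inputs with their own column names.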
A dataset containing cleaning validation data for equipment CAR.
Eq_CAR
A data frame with 30 rows and 3 variables:
Numeric vector with CAR measurements.
Numeric vector with Upper Specification Limits for CAR.
Factor vector with Cleaning Event identifiers.
Character vector with the deviation status for each cleaning event. Defaults to "normal".
Integer or character vector with unique project IDs assigned to each row.
Details about the data source.
A dataset containing cleaning validation data for equipment DAR.
Eq_DAR
A data frame with 60 rows and 3 variables:
Numeric vector with DAR measurements.
Numeric vector with Upper Specification Limits for DAR.
Factor vector with Cleaning Event identifiers.
Character vector with the deviation status for each cleaning event. Defaults to "normal".
Integer or character vector with unique project IDs assigned to each row.
Details about the data source.
A dataset containing cleaning validation data for microbial bioburden (Mic).
Eq_Mic
A data frame with 20 rows and 3 variables:
Numeric vector with Mic measurements.
Numeric vector with Upper Specification Limits for Mic.
Factor vector with Cleaning Event identifiers.
Character vector with the deviation status for each cleaning event. Defaults to "normal".
Integer or character vector with unique project IDs assigned to each row.
Details about the data source.