Data pre-processing for process optimization at a drinking water treatment plant in Ugu District Municipality, South Africa
Errors arise when water quality data from treatment plants are tested and recorded. These errors take the form of recordings left blank (missing values), obvious mistakes in writing or typing, or values that are too small to detect and are therefore censored. Censored values are known to lie below the limit of detection (LOD). In statistical analysis, blank cells can be filled with a chosen value, and censored values are often corrected by substituting a single constant throughout. This constant is a fraction of the limit of detection; the most commonly used choices are half the limit of detection, the limit of detection divided by the square root of 2, and the limit of detection multiplied by 0.75. Direct substitution for missing values and values below the limit of detection produces a uniform distribution for the values below the limit of detection and leaves the true distribution for those above it. As a result, how the values below the limit of detection should be treated depends on their percentage of the sample size. An alternative method mimics the distribution pattern of the values above the limit of detection to estimate the values below it. This can be done with an extrapolation technique or with maximum likelihood estimation. In this study, data from the Umzinto Water Treatment Plant were used to develop a data pre-processing program using Visual Basic for Applications (VBA) and Microsoft Excel 2013. The procedure involved four stages: data preparation, data pre-processing for blanks and non-detects, data pre-processing for the censored values, and finally the identification of outliers. The developed program was then used to pre-process raw water quality data, which resulted in satisfactory process time and data conversion.
The methodology can be borrowed for pre-processing data for data-driven environmental models, and hence it has a great influence on the sustainability of water treatment plants.
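The direct substitution rule described in the abstract can be illustrated with a minimal Python sketch. This is not the VBA program developed in the study. The function names, the convention of recording a censored reading as a string beginning with "<", the use of None for a blank cell, and the sample readings are all assumptions made for illustration only.

```python
import math

# The three substitution fractions named in the abstract.
# The dictionary keys are hypothetical labels chosen for this sketch.
FRACTIONS = {
    "half": 0.5,
    "sqrt2": 1 / math.sqrt(2),
    "three_quarters": 0.75,
}


def is_censored(reading):
    """Assumed convention: a censored reading is a string such as '<0.1'."""
    return isinstance(reading, str) and reading.strip().startswith("<")


def substitute_censored(readings, lod, rule="half"):
    """Replace every censored reading with one constant fraction of the LOD.

    Blank cells (None) and detected values are passed through unchanged.
    """
    constant = lod * FRACTIONS[rule]
    return [constant if is_censored(r) else r for r in readings]


def censored_share(readings):
    """Share of non-blank readings that are censored (0.0 if none recorded)."""
    recorded = [r for r in readings if r is not None]
    if not recorded:
        return 0.0
    return sum(1 for r in recorded if is_censored(r)) / len(recorded)


# Made-up sample, not data from the Umzinto Water Treatment Plant.
sample = [0.8, "<0.1", 1.2, None, "<0.1", 0.4]
cleaned = substitute_censored(sample, lod=0.1, rule="half")
share = censored_share(sample)
```

Because every censored reading receives the same constant, the sketch also reports the censored share of the recorded readings, which is the quantity the abstract says should guide whether substitution is appropriate or whether extrapolation or maximum likelihood estimation should be used instead.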
Magombo, J., Dzwairo, B., Moyo, S. & Dewa, M. 2015. Data pre-processing for process optimization at a drinking water treatment plant in Ugu District Municipality, South Africa. Environmental Economics. 6(1): 159-171.