Smoothing Methods for Continuous Permeation Data Measured Discretely Designated for Quick Evaluation of Barrier Materials

Information concerning barrier properties of materials used for the production of personal protective equipment not only fundamentally affects their useful properties, but also supports end users' decisions. Data obtained from measurements by standardized methods have to be processed by appropriate statistical methods. The article deals with numerical and statistical methods of reconstructing a continuous curve out of discretely measured data. These mathematical models are proposed as an extension to the manual measurement of a conductometric method. The behaviour of these models is demonstrated on two chemical warfare agent simulants; however, the proposed methodology is universal and can also be applied to chemicals which have no conductivity. Parametric and nonparametric models are studied in the curve reconstruction, as well as in the calculation of subsequent characteristics, for example, the lag-time. The nonparametric model shows the best results, which are in accordance with an expert's estimate.


Introduction
Permeable and insulating barrier materials are designed to protect the user's body surface against the effects of toxic substances. Their basic purpose is to separate the outdoor area with the presence of a mass concentration of chemical warfare agents (CWA) or toxic industrial chemicals (TIC) from the area where the toxic substance cannot appear. In this context, it should be understood that a very small amount of CWA can cause irreversible harm to health or even the death of the protective device user. The long-term protection against the permeation of toxic substances is thus determined by the concentration of the toxic substance and the type of barrier material used. Due to the different chemical and physical-chemical characteristics of CWA and TIC, it is currently not possible to find a breathable or insulating barrier material that meets high chemical resistance requirements while also meeting other requirements associated with the long-term activity of military forces in the contaminated area [1-5]. It will always be a compromise between the provided level of protection and the ability to perform an operational task [6-8].
Methods that have until recently been used for determining the chemical resistance of barrier materials against the permeation of toxic substances have been largely based on the principle of a colorimetric reaction and the subsequent colour response of the detection layer. Although these methods were accurate enough and provided very good and realistic data concerning the breakthrough time of a particular barrier material, they did not provide the data required by European standards. In the case of colorimetric methods, or methods based on a chemical principle, the breakthrough time is expressed as the time from the beginning of contamination of the barrier material with a particular CWA or TIC to the point where the first detectable, i.e., visible, change is registered. This usually corresponds to the so-called threshold dose. In the case of CWA, the data were available because they were surveyed in a targeted and long-term manner. However, in the context of a change in the security environment within the North Atlantic Treaty Organization and a different specification of security threats, it was necessary to develop methods and methodologies that would allow the assessment of barrier materials against the permeation of TIC, where information related to threshold doses is completely absent [9-11].
Currently, there is a growing demand for rapid and efficient information sharing in all types of military operations. Commanders' and staffs' information support is directed not only at providing information and data concerning classical combat operations of the units, but also at providing information pointing to the level of antigas protection provided. If there is a possibility to quickly assess the ability of barrier materials to resist contamination caused by CWA or TIC, the commander can make an immediate decision to adjust the regime of work in the contaminated area, to prepare preventive measures, to ensure rest and rotation of soldiers, and so on.

Aim of the Contribution
To evaluate the permeation of toxic chemicals through the barrier material, the values shown in the graph of cumulative amount versus time, or of CWA or TIC concentration versus time, are essential. The part of the permeation curve from which the steady-state permeation rate can be read is critical for determining the chemical resistance value using the lag-time. The currently used permeation calculator [12] makes it possible to read the value of the breakthrough time; however, only by manually designating a specific place on the permeation curve. The lag-time value is then read from the linear part of the graph of the permeated mass depending on time. This reading is also done manually, by clicking on the graph to enter a line that is tangent to the linear part of the graph; its intersection with the time axis determines the lag-time value.
Since the measurement usually takes place at set time intervals, the obtained values of the above mentioned characteristics are in the form of discrete recordings of continuous quantities. The first step is to find a continuous function that appropriately represents the empirical data. Assuming such a function exists, it can be used for modelling and further analysis, which is better than working with a set of discrete values. In particular, it is possible to calculate the value of the breakthrough time exactly and to construct a tangent line of the function analytically as well. In such cases, the application of smoothing methods, either parametric or nonparametric, seems to be very suitable.
The main objective of this article is to provide a basis for the automation of the measuring procedure so that it is not necessary to determine values from the permeation curve manually. Sub-objectives are then to propose an optimal way of estimating the continuous function which is hidden in the measured discrete values [13-16], and to provide a mathematical-statistical background for calculations of the residence time and other permeation characteristics acquired from the permeation profile.

Data and Methods
A number of methods are currently used to test the chemical resistance of barrier materials. These methods work on different analytical principles. In this article, we focus on methods that are based on conductivity.
One of the variables that is useful as a starting point for the evaluation of the permeation of toxic substances through barrier materials is conductivity. The change of conductivity corresponds to the concentration of dissociated ions of the test chemical in a well-defined volume of redistilled water. After calibration of the measuring electrode response to a specific and very precise amount (volume) of the test chemical, it is possible to obtain information about the change in conductivity of the specific exact volume of the redistilled water. Within the practical implementation of the permeation measurement, the test chemical, which diffuses through the tested barrier material, is absorbed into the redistilled water. After dissociation, the conductivity is determined and converted to concentration by the permeation calculator. When measuring the permeation of toxic substances under static conditions, the concentration of the test chemical in the redistilled water gradually increases, and so does the concentration of measurable dissociated ions [17].
KONDUKTOTEST is a method for the quick determination of the resistance of porous (filtration) and non-porous (insulation) barrier materials to the static permeation of sulphur mustard and other volatile toxic compounds with acidobasic properties which are soluble in water with dissociation into ions. The KONDUKTOTEST device, which was developed at the Technical Research Institute of Protection in Brno (Czech Republic), is a semi-automatic device enabling continuous monitoring of the permeation of volatile toxic substances through the materials of insulating and filtering protective equipment, using a conductivity sensor consisting of carbon electrodes [18].
The use of the KONDUKTOTEST method fully respects the requirements based on the standard classical analytical method of conductometry. The general principle of conductometry is applied; however, the test chemical dissociates into ions only after it has penetrated the barrier material being tested. This leads to a more versatile approach in the application of a wide range of methods useful for barrier material testing.
For the purpose of evaluating the resistance of barrier materials against the permeation of toxic substances, the start of the permeation curve (see Fig. 1), which corresponds to the state where no toxic substance has permeated through the barrier material, is essential. This corresponds to the time when the toxic substance was not dissociated in the redistilled water, and thus no response of the measuring system was recorded. The linear part of the graph corresponds to the state where the toxic substance already permeates intensively through the tested material and is used as a tool for reading the lag-time value. For the purposes of evaluating the resistance of barrier materials, this is the time when the detected amount of the dissociated test chemical would cause death or disqualification of a user of the antigas protective equipment. The last part of the graph, i.e., reaching the steady-state permeation rate, expresses the limit of the detection system's ability to react to the detected amount of the permeated toxic substance. It no longer has any practical significance for the purposes of assessing the resistance of barrier materials under military conditions. Achieving this state indicates a situation where the permeation measurement can be terminated.

Chemical warfare agents are chemical compounds and mixtures which, when used in combat operations, can kill, seriously injure or incapacitate persons, and contaminate the environment, people, weapons, objects and other material. Due to the very high toxicity and hazardousness of this type of chemicals and the necessity to respect very strict safety measures related to health protection, it is advisable to use substitute test substances whose behaviour is as close as possible to the behaviour of CWA. In this study, we have used two substitute substances, namely cyclohexylamine and pyridine.
Cyclohexylamine is an organic compound belonging to the class of aliphatic amines. It is a colourless liquid. Samples are often coloured due to chemical impurities; however, colour (chemical purity) has no effect on permeation measurements. Cyclohexylamine is characterised by its fishy smell. For the purpose of conductivity measurements, it is very important that it is infinitely miscible with water, which means that it can form aqueous solutions with a very high conductivity response. In practice, this means that in an aqueous environment, molecules dissociate very rapidly into ions, which can then be detected by a conductivity measuring device. Such behaviour corresponds to the characteristics of those CWA and TIC that have the same or similar chemical properties, i.e., those that dissociate unrestrictedly in water. For this reason, cyclohexylamine is considered a standard test chemical for the evaluation of barrier properties in relation to those toxic substances that dissociate unrestrictedly in the aquatic environment.
Pyridine is an organic aromatic heterocyclic compound of carbon, hydrogen and nitrogen. Under normal conditions, it is a colourless, flammable liquid with a characteristic odour, miscible with water and ethanol. In this case, however, unlimited miscibility is not associated with an unlimited ability to dissociate into ions detectable by the measuring conductometric device. The use of this type of chemical is based on the fact that some CWA and TIC dissociate in aqueous environments only to a limited extent, i.e., in quantities and concentrations approximately equivalent to the dissociation capacity of pyridine. For this reason, pyridine is considered a standard test chemical substance for the evaluation of barrier properties in relation to those toxic substances that dissociate only to a limited extent in the aquatic environment.

Models of the Measured Data
Regression models provide a suitable approach to modelling the data, i.e., modelling the dependence of concentration on time. A standard regression model is defined as

c_i = F(t_i, \mathbf{a}) + \varepsilon_i, \quad i = 1, \ldots, n \qquad (1)

where c_i is the concentration of the chemical substance at fixed time points t_i, F is the regression function, \mathbf{a} = (a_1, \ldots, a_k) denotes a vector of unknown parameters, and \varepsilon_i are independent and identically distributed random variables [19].
Parameters \mathbf{a} of the model are selected using the least squares approach, i.e., by minimising the sum of squared vertical distances between the measured and estimated concentration values:

\hat{\mathbf{a}} = \arg\min_{\mathbf{a} \in \mathbb{R}^k} \sum_{i=1}^{n} \left( c_i - F(t_i, \mathbf{a}) \right)^2

Piecewise Linear Model
Since the focus of the research is on the initial and middle part of the curve, the simplest piecewise linear estimate of the following form could be used:

F(t, \mathbf{a}) = a_1 + a_2 (t - z)_+

where a_1, a_2 denote the parameters of the function and z is the breakpoint of the function. The function (\cdot)_+ is defined as (x)_+ = \max(x, 0).

Parametric Model

Two types of models can be distinguished here, linear and nonlinear, depending on the linearity in the parameters. The linear model has one basic configuration; the nonlinear model is more flexible and opens the door to a great number of possible forms. From the nonlinear model candidates, we selected the logistic curve model, which is in accordance with the typical shape of the concentration curve. The model takes the form

F(t, \mathbf{a}) = \frac{a_1}{1 + \exp(-a_2 (t - a_3))}

where a_1 denotes the curve maximum value, a_2 is known as the logistic growth rate and a_3 denotes the curve midpoint [20].
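For illustration, both parametric models above can be fitted by least squares. The following sketch is in Python with NumPy and SciPy (the article's own computations use Matlab's fminsearch; the function names and the synthetic data here are illustrative assumptions, not the article's measurements):

```python
import numpy as np
from scipy.optimize import minimize

def piecewise_linear(t, a1, a2, z):
    # F(t) = a1 + a2 * (t - z)_+ , where (x)_+ = max(x, 0)
    return a1 + a2 * np.maximum(t - z, 0.0)

def logistic(t, a1, a2, a3):
    # a1: curve maximum, a2: logistic growth rate, a3: curve midpoint
    return a1 / (1.0 + np.exp(-a2 * (t - a3)))

def fit_least_squares(model, t, c, start):
    # Minimise the sum of squared vertical distances between data and model,
    # using Nelder-Mead (the same simplex method as Matlab's fminsearch).
    sse = lambda a: np.sum((c - model(t, *a)) ** 2)
    return minimize(sse, start, method="Nelder-Mead").x

# Illustrative synthetic data: a logistic curve sampled at discrete times
t = np.linspace(0.0, 60.0, 25)
c = logistic(t, 10.0, 0.3, 30.0)
a_hat = fit_least_squares(logistic, t, c, start=[8.0, 0.2, 25.0])
```

On these noiseless data the recovered parameters are close to (10, 0.3, 30); with real measurements a sensible starting point matters, since Nelder-Mead is a local method.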

Nonparametric Model
A nonparametric model can be written in the same simple form as the parametric model, i.e., c(t) = F(t) + \varepsilon. The shape of the function F draws a distinction between the parametric and nonparametric models. In the parametric approach, it is assumed that F has a prespecified functional form. On the other hand, nonparametric estimates attempt to reconstruct the unknown function from the data themselves, using as few assumptions as possible.
Such a model can be fitted, among others, by using a kernel regression model with local linear polynomials. It involves solving the least squares problem

(\hat b_0, \hat b_1) = \arg\min_{b_0, b_1} \sum_{i=1}^{n} \left( c_i - b_0 - b_1 (t_i - t) \right)^2 W_i(t, h) \qquad (2)

with c_i standing for the measured concentration values at times t_i; b_0, b_1 stand for the unknown parameters and W_i(t, h) is a weight function that depends on a parameter h [21].
The kernel regression model uses a function called a kernel as the weight function. The kernel is usually considered to be a symmetric nonnegative function which integrates to one. Another parameter used in the kernel estimate is h, called the bandwidth; it influences the smoothness of the resulting estimate [22-24].
Specifically, we have opted for the weight function based on the Gaussian density and a smoothing parameter h which minimizes a quantity based on the Kullback-Leibler cross-validation criterion [25, 26].
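A minimal sketch of the local linear estimator of Eq. (2) with Gaussian weights, written in Python/NumPy (in practice the R function npreg was used; this direct implementation only illustrates the definition and takes the bandwidth h as given):

```python
import numpy as np

def local_linear(t_grid, t, c, h):
    # For each point t0 of t_grid, solve the weighted least-squares problem
    #   min_{b0,b1} sum_i (c_i - b0 - b1*(t_i - t0))^2 * K((t_i - t0)/h)
    # with a Gaussian kernel K, and return b0 as the estimate of F(t0).
    est = np.empty(len(t_grid))
    for j, t0 in enumerate(t_grid):
        u = t - t0
        w = np.exp(-0.5 * (u / h) ** 2)            # Gaussian kernel weights
        X = np.column_stack([np.ones_like(u), u])  # local design matrix [1, t_i - t0]
        XtW = X.T * w                              # weight each observation
        b = np.linalg.solve(XtW @ X, XtW @ c)
        est[j] = b[0]                              # F_hat(t0) = b0
    return est
```

A useful sanity check: a local linear estimator reproduces straight lines exactly, so fitting data lying on c = 2 + 3t returns the same line for any bandwidth.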
Then the estimate of the regression function at point t takes the form

\hat F(t) = \hat b_0

where \hat b_0 is the solution of Eq. (2).

Assessing the Quality of the Models

For assessing the goodness-of-fit of the proposed models, a criterion which allows comparing parametric and nonparametric models together has to be applied. Therefore, an alternatively defined coefficient of determination is used [27]. It allows comparing parametric and nonparametric estimates of linear and nonlinear regression functions. Denoting \hat c_i the fitted value for the outcome c_i and \bar c the mean of the outcomes, the coefficient of determination is defined as

R^2 = \frac{\left( \sum_{i=1}^{n} (c_i - \bar c)(\hat c_i - \bar c) \right)^2}{\sum_{i=1}^{n} (c_i - \bar c)^2 \, \sum_{i=1}^{n} (\hat c_i - \bar c)^2}

The value of R^2 lies in the interval from zero to one, with the value 1 indicating a perfect fit.
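This alternative coefficient of determination, the squared correlation between observed and fitted values (the form used, e.g., by the R np package), can be computed directly; a short sketch:

```python
import numpy as np

def r_squared(c, c_hat):
    # Squared correlation between observed and fitted values; applicable to
    # parametric and nonparametric fits alike, always within [0, 1].
    dc = c - c.mean()
    df = c_hat - c_hat.mean()
    return (dc @ df) ** 2 / ((dc @ dc) * (df @ df))
```

Note that, unlike the classical R^2, this quantity equals 1 for any fit that is an exact linear transformation of the data, which is what makes it comparable across model families.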

Lag-time
The lag-time value is estimated as the intersection of the tangent line to the linear part of the concentration curve with the time axis. Regarding the general shape of the concentration curve, the linear part is in the neighbourhood of the inflection point, i.e., the point where the first derivative reaches its maximum value. (The minimum value of the first derivative, which could also indicate an inflection point, is not considered because of the overall behaviour of the data.) Therefore, we need to calculate or estimate the first derivative, find its maximum point, construct the tangent line, and find its intersection with the time axis. The calculation of the first derivative for the piecewise and parametric models is straightforward, since we know the formulas of the respective functions. Namely, the first derivatives of the piecewise linear model and the parametric logistic model are

F'(t) = a_2 \, \mathbb{1}(t > z) \quad \text{and} \quad F'(t) = \frac{a_1 a_2 \exp(-a_2 (t - a_3))}{\left( 1 + \exp(-a_2 (t - a_3)) \right)^2} \qquad (3)

However, there is neither a closed form of the kernel estimate of the concentration curve, nor of its derivative. In general, there are two ways to estimate the derivatives of the kernel estimate: (i) differentiate the estimate of the function, (ii) estimate directly the derivative of the function.
The former way can be divided into two options. Namely, we can calculate the central differences of the estimated function, or we can use the derivative of the kernel estimate [28]. If the kernel estimate is written in its general form as \hat F(t) = \sum_{i=1}^{n} w_i(t) c_i, then its derivative is \hat F'(t) = \sum_{i=1}^{n} w_i'(t) c_i, where the derivative is taken with respect to t. The latter way of estimating the derivatives consists of two approaches. The first one is based on the generalised version of Eq. (2)

(\hat b_0, \hat b_1, \ldots, \hat b_R) = \arg\min \sum_{i=1}^{n} \left( c_i - \sum_{r=0}^{R} b_r (t_i - t)^r \right)^2 W_i(t, h) \qquad (4)

If the values \hat b_0, \hat b_1, \ldots, \hat b_R minimize this equation, then the estimated r-th derivative of F(t) is given by \hat F^{(r)}(t) = r! \, \hat b_r for r \leq R [29, 30]. The other approach is based on calculating the regression function not on the data themselves, but on their symmetric differences

c_i^{(1)} = \frac{c_{i+m} - c_{i-m}}{t_{i+m} - t_{i-m}}

where m is a positive integer [31-33].
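The symmetric differences used by the last approach are simple to form; a Python sketch under the same illustrative conventions as above (the function name is an assumption):

```python
import numpy as np

def symmetric_differences(t, c, m=1):
    # c_i^(1) = (c_{i+m} - c_{i-m}) / (t_{i+m} - t_{i-m});
    # returns the differences together with the interior time points t_i
    # they belong to, so a regression curve can then be fitted to them.
    d = (c[2 * m:] - c[:-2 * m]) / (t[2 * m:] - t[:-2 * m])
    return d, t[m:-m]
```

Smoothing these differences with a kernel regression corresponds to the central-difference route to the derivative estimate described above.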

Results and Discussion
In this section, a description of modelling the discretely measured continuous data and subsequent calculation of the lag-time is provided.

Concentration Curve Estimates
Estimates of the piecewise linear and logistic models and their parameters were found by minimization of the squared (vertical) distance between the model and the data using the Matlab function fminsearch, which is based on the Nelder-Mead algorithm for finding minima of a function of several variables. The bandwidth of the nonparametric model was calculated using the R function npreg (from the library np).
The proposed methods of modelling the discretely measured data are presented on two data sets, cyclohexylamine and pyridine, here denoted as data A and data B, respectively. In Figs 2 and 3, all the proposed estimates are displayed: the piecewise linear model in blue, the parametric model based on the logistic curve in orange, and the nonparametric model represented by the kernel regression in green. It can be easily seen that the nonparametric model gives estimates which are closer to the data than the parametric models, i.e., the logistic and piecewise linear ones. As is obvious from the graphs in Fig. 2, for data A, all of the proposed estimates give reasonable curves describing the behaviour of the concentration over time. On the other hand, only the nonparametric model can be assessed as appropriate for the data set B (Fig. 3). Even though the kernel regression model does not have a compact equation, it has a great advantage: this model does not assume the shape of the estimated function beforehand. It is said that nonparametric models let the data speak for themselves. That makes them more intuitive, and flexible enough to grasp subtle aspects of the modelled data.
Focusing on the disadvantages of the proposed methods, it needs to be pointed out that the piecewise linear model as well as the logistic model suffer from a prescribed shape of the function: what if the shape is wrongly assumed? The parametric methods are less robust than the nonparametric ones. Although the nonparametric models, to which the suggested kernel smoothing belongs, are more robust than the parametric models, their computation is slightly more difficult. Also, prediction with these models is a more complicated task than prediction in parametric settings.
In Tab. 2, the pros and cons of the proposed models are briefly summarized regarding their practical application.

Tab. 2 Pros and cons of the proposed models

Model    | Pros (+)                               | Cons (−)
Linear   | easy interpretation, quick evaluation  | predefined shape, number of break points
Logistic | exact equation, easy interpretation    | predefined shape, nonlinearity in parameters
Kernel   | flexibility, data speak for themselves | bandwidth selection, no closed functional form

Lag-time Estimates
The next step in our analysis is to provide a procedure to estimate the lag-time value, for which a model of the first derivative is needed. The parametric models have a straightforward calculation of their derivatives, see Eq. (3). The estimates of the first derivative of the kernel regression model were calculated using the central differences and the R functions npreg, npregfast and npregderiv.
In Figs 5 and 6, central differences of the raw data with the kernel regression estimates based on these differences (both in black) are displayed. In the lower panels of the figures, the resulting estimates of the first derivatives are depicted, using the approximation formula F'(t) = \hat b_1, see Eq. (4) (in violet), and the direct estimate of the first derivative (in light blue). All methods provide similar results of the first derivative estimate. Kernel regression based on the central differences (denoted as 'kernel 1') gives a slightly undersmoothed estimate, as we can see in Fig. 5. On the other hand, the direct estimate (kernel 3) is influenced by the so-called boundary effects, as we can see in Fig. 6. The estimate using the approximation formula (kernel 2) provides a shorter estimate, because it uses an additional smoothing parameter which determines how many data points are used from each side of the estimation point [30]. Therefore, the whole estimated function is shorter by twice the size of this smoothing parameter.
Having obtained the estimates of the first derivative, we can continue with the lag-time calculations. We construct a tangent line to the concentration curve at its inflection point t_0 using the standard formula c = c(t_0) + c'(t_0) \cdot (t - t_0). Then, we calculate its intersection with the time axis to obtain the lag-time value. In Tab. 3, the final equations of the tangent lines and their intersections with the time axis are summarized for data sets A and B.
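The tangent-line construction reduces to a single formula: setting c = 0 in c = c(t_0) + c'(t_0)(t - t_0) gives the lag-time t_0 - c(t_0)/c'(t_0). A one-function sketch:

```python
def lag_time(t0, c0, dc0):
    # Intersection of the tangent line c = c0 + dc0*(t - t0) with the
    # time axis (c = 0); t0 is the inflection point, c0 = c(t0), dc0 = c'(t0).
    return t0 - c0 / dc0
```

For instance, for a logistic curve with a_1 = 10, a_2 = 0.3, a_3 = 30 (illustrative numbers, not the article's measurements), the inflection point is t_0 = 30 with c(t_0) = 5 and c'(t_0) = a_1 a_2 / 4 = 0.75, giving a lag-time of 30 − 5/0.75 ≈ 23.33.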
From the graphs of the central differences (left upper panels of Figs 5 and 6), it is clear that the derivative estimates for data A will be more consistent than for data B.
Again, having 24 distinct measurements of the concentration with respect to time for each of the chemical test substances, we can compare the lag-time values obtained from our calculations with those estimated manually by an expert. Boxplots in Fig. 7 and their summary statistics in Tab. 4 provide a visual and numerical comparison of the lag-times. From the boxplots and their values, it is obvious that only the linear model differs from the other models. This is also supported by the analysis of variance with the post hoc Tukey's test, which shows that the only significantly different group is the lag-time estimated from the linear model. Moreover, we can see that the resulting kernel derivative estimates are close to the ones estimated manually by an expert.
The lag-time value reading is used to quickly and easily determine the value of the protection time of the barrier material used in field conditions. It should be noted that the evaluation of barrier properties in field conditions is different from the evaluation in laboratory conditions. For the purpose of rapid and relatively accurate evaluation of the protective time of the evaluated barrier material, lag-time methods appear to be very reliable.
In addition, it appears that the differences between the values required by the respective standards have no practical significance in cases where there is massive penetration of the test chemical through the barrier material under test in short times. In these cases, the lag-time readout method can be compared, in terms of its factual relevance, to methods based on achieving standard values of permeation masses and rates.

Outline of an Algorithm
From Fig. 7 and Tab. 4, it can be seen that the nonparametric estimates of the lag-time give similar results to those measured by an expert, i.e., those designated manually by clicking on the graph to enter a line that is tangent to the linear part of the graph.
As one of the objectives of this article was to provide a basis for an automation of the measuring procedure so that it is not necessary to determine values from the permeation curve manually, the steps of the proposed procedure for modelling the concentration and estimating the lag-time are summarized below.
1. Given the discretely measured concentration with respect to time, select the model and calculate the estimate of the concentration curve.
2. Calculate the first derivative of the concentration curve and find its maximum point (i.e., the inflection point of the concentration curve).
3. Construct the tangent line at the inflection point and find its intersection with the time axis, which produces the lag-time value.

As we can see, the kernel regression gives the best results in both phases of the estimation process. It fits the measured concentration best of all the proposed models (see the coefficients of determination in Fig. 4). Also, the resulting lag-time calculation has the smallest variance when compared to the parametric models (see Fig. 7). Although the nonparametric approach might seem more complicated than the parametric one, the consumption of computer time is more or less the same as for the parametric models.
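The three steps above can be chained into one routine; a Python sketch operating on an already-estimated concentration curve, with the derivative taken numerically (the function name and the test curve are illustrative assumptions):

```python
import numpy as np

def lag_time_from_curve(t_grid, f):
    # 1. evaluate the estimated concentration curve on a dense grid,
    # 2. differentiate numerically and locate the inflection point
    #    (the maximum of the first derivative),
    # 3. intersect the tangent at that point with the time axis.
    c = f(t_grid)
    dc = np.gradient(c, t_grid)
    i0 = int(np.argmax(dc))
    return t_grid[i0] - c[i0] / dc[i0]
```

On a logistic test curve with maximum 10, growth rate 0.3 and midpoint 30, this recovers the analytic lag-time 30 − 5/0.75 ≈ 23.33 up to discretization error of the grid.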

Conclusion
Discretely measured continuous data bring the task of reconstructing the continuous function describing the variable of interest. In this article, we presented statistical methods of reconstructing the concentration curve of chemical substances with respect to time and proposed a mathematical procedure for calculating the lag-time values. We focused on modelling chemical substances which form ionic solutions in water, i.e., which are able to undergo dissociation.
The proposed smoothing methods can be applied to substances whose concentration is determined from different measured characteristics, such as methods based on the detection of changes in frequency, the activity concentration of radioactive particles, Raman spectra, the concentration of the substance based on the occurrence of specific peaks, etc.
As seen from the results, linear regression was not a suitable tool for flexible modelling; therefore, it was convenient to employ more sophisticated methods, either parametric (represented here by the logistic model) or nonparametric. The kernel estimate was proposed as an alternative to the piecewise linear model and the logistic model. Since the article was inspired by real data sets, namely cyclohexylamine and pyridine, the presented models were tested on these data sets. The nonparametric model gave the best results in fitting the course of the concentration of the chemical substance over time. Also, the estimates of the lag-time values based on the kernel estimates were the closest to the results obtained manually by an expert using the KONDUKTOTEST device.
Having a continuous model of the discretely measured data allows estimating other characteristics of the data themselves, such as the normalized breakthrough time, which is connected to the concentration curve, or the lag-time, which is connected to the derivative of the concentration curve. The next step would be the automation of the measuring procedure, where one can, after the proposed steps of smoothing and calculating, read off the values of selected characteristics.

Fig. 1
Fig. 1 Typical course of the concentration curve in time

Fig. 4
Fig. 4 Boxplots of the coefficients of determination for data A (left panel) and data B (right panel)

Fig. 5
Fig. 5 Estimates of the first derivative of data A: central differences (data, black +), kernel regression on central differences (kernel 1, black), estimate using the approximation formula (kernel 2, violet) and the direct estimate (kernel 3, blue)

Fig. 6
Fig. 6 Estimates of the first derivative of data B: central differences (data, black +), kernel regression on central differences (kernel 1, black), estimate using the approximation formula (kernel 2, violet) and the direct estimate (kernel 3, blue)

Tab. 1 Boxplot summary statistics of the coefficient of determination

Tab. 3 Tangent lines and lag-time values for data sets A and B

Fig. 7 Boxplots of the lag-times [min] for data A (left panel) and data B (right panel)

Tab. 4 Boxplot summary statistics of the lag-time values [min]